Parsing algorithms are developed such that syntactic tree-height is
minimized, or reduced, with respect to application of the associative,
commutative, and distributive (but not factoring) laws of arithmetic,
on arithmetic expressions composed of well-formed sequences of the
symbols add, subtract, multiply, divide, scalar identifier, and
assignment, where it is assumed that a unique weight (i.e., program
execution time) may be associated with each symbol. A parsing algorithm
is also developed such that syntactic tree-height is minimized, with
respect to application of the associative law, on expressions composed
of a conformable sequence of matrix products, where the matrices are not
necessarily square, and such that the overall number of computer
operations is minimized if tree-height is not affected.

A new non-preemptive scheduling algorithm of a weighted-node
acyclic dependency graph, having $n$ nodes, on a system of $k$ equipotent
machines is developed such that all nodes are processed in the least
amount of time. The assignment process requires $O(n^2)$ computer
operations. A new algorithm to determine $k^* = \mathrm{LUB}(k)$, such that the
graph may be processed in the critical time, is also presented. Finally,
it is shown how to optimally schedule a system of $m$ special purpose
machines, where there are $k_i$ equipotent machines of each type, on a
graph.
1. INTRODUCTION

1.1. History of Parallel Computation

Although Charles Babbage first proposed devices which would
achieve parallel computation nearly 150 years ago in his Analytical
Engine (multiple arithmetic units and memory look-ahead (25)), until
quite recently computer manufacturers have been content to build
machines which run in a basically sequential manner. They, notably IBM,
still exhibit a large amount of lethargy in the development of parallel
processing systems even though the computer community has recognized
the need to build computer systems with processing power which
exceeds that possible by component speed-up advances. We have progressed
from roller bearings and gears to transistors and our computers are
still not fast enough to satisfy our needs. However, the absence of
parallelism is not to be wondered at; up until now, the expertise to
exploit parallelism efficiently has simply not been available. It
is expected that the contents of this paper will shed new light on
the utilization of parallel computing systems.

During the past decade the first steps in parallel computation
have been taken by Burroughs Corporation and by Control Data
Corporation. Their reasons for doing so were, however, quite different.
Burroughs accurately recognized that in the conventional processing
system (sequential) a large percentage of the machine at any instant
of time was idle and performing no useful work. In order to increase
subsystem utilization, and thus increase cost-effectiveness, Burroughs
proposed and built the B-5000 in 1960 which operated in a
multiprogramming/multiprocessing mode. Control Data, on the other hand,
was not concerned with system cost, but with system speed. They
developed a machine which could perform several arithmetic operations
as well as memory operations simultaneously, with an ad hoc parallelism
detection device controlling the functional units operating in parallel,
and thus at least double the amount of work performed over that of a
comparable conventional system. This design concept also increased
the cost-effectiveness. The CDC 6600 was introduced about 1965.
While  the  market  place  for  these  systems  is  quite  different,  let  us 
place  them  in  the  "general  purpose  computer"  class,  for  the  design 
concepts  have  since  been  copied  by  many  other  manufacturers. 

We  should  also  mention  a  very  special  kind  of  parallel  computer 
system  class  which  may  be  available  for  the  first  time  this  year 
(1972). The Control Data STAR computer, a pipeline vector machine,
and the University of Illinois ILLIAC IV (built by Burroughs), an
array of uniform processors, are capable of performing a large number
of identical operations (e.g., add) simultaneously. Texas Instruments
is also building an Advanced Scientific Computer (ASC), which is
remarkably similar to the STAR computer. While these machines perform
admirably on a special class of problems, their parallelism is not
easily exploited if the processing algorithms do not require many
simultaneous operations of the same type, or if the input data is not
well-ordered, although storage techniques have been developed to
ameliorate the latter problem (19).


Despite this recent appearance of concurrent processing, none of
the computer manufacturers, or users, has been successful in exploiting
the existing, or proposed, machines to achieve the parallel
computation inherent in the machine design. It is still necessary for some
clever programmer, applying some heuristic algorithms which are
intuitive only to him, to design a program which exploits the parallelism.

1.2.   Scope 

This  paper  attacks  two  small,  but  not  insignificant,  problems  in 
the  field  of  parallel  computation.   Firstly,  we  show  how  to  modify  and 
parse an arithmetic expression such that syntactic tree-height is
minimized, given that the computer operations (e.g., fetch, add, multiply,
...)  each  require  a  specific  amount  of  time,  not  necessarily  equal, 
during  program  execution.  Other  investigators  have  solved  the  tree- 
height  problem  assuming  unit-time  computer  operations,  but  this  is  the 
first  solution  based  on  weighted  operations.  Clearly,  syntactic  tree- 
height  tells  us  how  quickly  an  expression  may  be  evaluated  by  a  system 
of  parallel  processors,  given  a  sufficient  number  of  processors.   But 
what  do  we  mean  by  "sufficient  number  of  processors"  and  furthermore, 
what  is  the  best  way  to  use  the  processors  to  evaluate  an  expression 
in  the  minimum  possible  time?  This  is  the  second  problem  solved  in  this 
paper. 

We show how to schedule a system of equipotent processors, or
machines, on a job, which may be described not only by a weighted-node
tree, but also by a weighted-node, acyclic directed graph, where the
scheduling is not preemptive, such that the job is completed in the
least amount of time. (We leave the problem of finding common
subexpressions, and thus changing a syntactic tree to a graph, to
investigators such as Breuer (5).) The scheduling algorithm is new, but
not the first algorithm to optimally schedule a sufficient, or
insufficient, number of machines on a weighted-node graph. The scheduling
algorithm requires $O(n^2)$ computer operations, where $n$ is the
number of nodes, to make the assignments. We also show, in a new way,
how to determine a lower bound on the number of machines required in
order to process the job in the critical time of the graph. Finally,
we show how to optimally schedule a system of special purpose machines,
with one or more machines of each type, on a weighted-node graph.

We  solve  the  problem  of  tree-height  minimization  in  the  following 
way.   All  arithmetic  expressions,  except  those  including  a  sequence 
of  matrix  products,  are  analyzed  by  application  of  the  distributive 
law  of  arithmetic  (but  distribution  is  made  only  if  tree-height  is 
reduced),  and  the  associative  and  commutative  laws;  a  sequence  of  matrix 
products,  where  the  matrices  are  not  necessarily  square,  is  analyzed 
by  application  of  the  associative  law  of  arithmetic.   This  is  also  the 
first  solution  to  the  tree-height  minimization  problem  of  the  product 
of  a  sequence  of  non-square  matrices. 

The term "minimize", in the context of tree-height, has a very
special meaning in this paper. Within the constraints of application of
the distributive, associative, and commutative laws only, "minimize"
has the usual meaning. Factoring is a valid arithmetic operation
(i.e., the converse of the distributive law); however, since factoring
is a fairly complicated process, it is not included in our analysis
pertaining to tree-height reduction. Hence, if we were to permit the
application of a factoring procedure also, then what we have found to
be "minimal" is in fact not. For example, Pan (30) has given a lower
bound on the syntactic tree-height of an nth degree polynomial,
$P_n(x)$. The lower bound is probably only achievable when $P_n(x)$ is
written in factored form, viz:

$P_n(x) = (x - x_1)(x - x_2) \cdots (x - x_n)$.

In the syntactic tree of this expression, $n$ subtractions could be
performed simultaneously to evaluate all factors, and then the
multiplications could be performed in about $\log_2(n-1)$ steps to complete
the evaluation of $P_n(x)$. Muraoka (28) and Maruyama (24) have found
methods, which were independently rediscovered by Munro and Paterson
(26), to compute polynomials such that syntactic tree-height is reduced
without the necessity of finding the polynomial roots. However, these
syntactic trees are not achievable from Horner's polynomial form by the
methods described in this paper without occasionally factoring out
common subexpressions.

The history of tree-height reduction, and hence parallelism
exposure, in arithmetic expressions is not slight. Baer and Bovet (2) and
Muraoka (28) develop methods to reduce syntactic tree-height using the
three laws of arithmetic by assuming that all operations require unit
time for evaluation. While Muraoka and Kuck (29) recognized that a
weight could be given to a matrix product, their tree-height reduction
algorithm only deals with extended matrix product expressions involving
square matrices and vectors. It is believed that Hao (10) was the
first to study the implications to syntactic tree-height of arithmetic
expressions when multiplication is $\alpha \ge 1$ times as costly (in time)
as addition.

We solve the problem of optimal assignment of weighted nodes
from a graph to a parallel system of $k$ machines by judicious
application of the Critical Path Method principles, as described by
Kauffman (18). During the assignment process, the critical path of
the graph is dynamic, and at any instant, nodes on the current critical
path are given an assignment potential greater than nodes off the
critical path. A sufficient, but not necessary, condition that the
algorithm is indeed optimal is that all node-weights are of the same
order of magnitude. This condition appears in the scheduling algorithm
as a one-level look-ahead; if the node-weight range is more than an order
of magnitude, to ensure optimality it would be necessary to incorporate
a multiple-level look-ahead in the algorithm.
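
The critical time on which this priority rule rests can be illustrated with a short sketch. It is not the scheduling algorithm of Chapter 4; it only computes the longest weighted path through a weighted-node acyclic graph. The function name `critical_time` and the dictionary encoding of the graph (`weights`, `preds`) are assumptions made for this illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def critical_time(weights, preds):
    """Longest weighted path through a weighted-node acyclic graph.

    weights: node -> execution time of that node
    preds:   node -> iterable of predecessor nodes (may omit source nodes)
    """
    order = TopologicalSorter({n: preds.get(n, ()) for n in weights}).static_order()
    finish = {}
    for node in order:  # predecessors always appear before their successors
        start = max((finish[p] for p in preds.get(node, ())), default=0)
        finish[node] = start + weights[node]
    return max(finish.values())

# Hypothetical job: c needs a and b, d needs c.
weights = {"a": 2, "b": 3, "c": 2, "d": 1}
preds = {"c": ["a", "b"], "d": ["c"]}
print(critical_time(weights, preds))  # 6, along the path b -> c -> d
```

No schedule on any number of machines can finish this job in less than the value returned, which is why nodes on the current critical path receive the higher assignment potential.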

Scheduling problems are as old as the industrial revolution. By
the law of entropy, our search for more efficient machine utilization
has not diminished with increased technology. In recent times, this
interest is demonstrated by the development of the PERT system for the
U.S. Navy's POLARIS project. Johnson (13) has studied many classic
models in the field of scheduling. Conway, et al. (6) have studied
probabilistic systems and queuing theory. Muntz and Coffman (27) have
studied pre-emptive scheduling. Hu (12) was the first to give an
optimal scheduling algorithm for a job described by a unit-node tree.
Schwartz (34) presented a non-optimal scheduling algorithm (but useful
nonetheless) for a job described by a weighted-node graph. Ramamoorthy,
et al. (31) were the first to give an optimal scheduling algorithm,
using a sufficient number of machines, for a job described by a
weighted-node graph. Their results were developed independently of
this author, however. It is evident that the interest in scheduling
problems is alive, and will certainly continue.

Let us reflect, for a moment, on the implications of the
developments in tree-height reduction and scheduling, as given in this paper,
on compiler and machine organization. Clearly, a source-code program
must be analyzed at compile time so that arithmetic expressions may be
parsed to minimize syntactic tree-height. Similarly, the scheduling
algorithm must be exercised on the object-code before run-time. Whether
a machine has a system of parallel processors which are governed by an
ad hoc scheduler (as the CDC 6600) or by a tag-directed method (as the
IBM 360/91), is immaterial; it would be possible to use the full
potential parallelism of either system if proper analysis is made. Also,
since it is now known how to utilize a parallel system, we need not be
afraid to build them. In other words, we can study programming
problems and answer questions like: "How big should the parallel system
be, and how should the processors communicate with one another?"

1.3.   Reader's  Guide  to  Chapters  2,  3,  and  4 

In  order  to  facilitate  reading  this  paper  and  to  indicate  the 
development  in  each  chapter,  we  include  the  following  reader's  guide. 
Chapters  2  and  3  present  algorithms  to  minimize  syntactic  tree-height 
of  scalar  arithmetic  expressions  and  a  sequence  of  matrix  product 
expressions, respectively. Chapter 4 presents the scheduling
algorithm.

The notation used throughout Chapter 2 is presented in section
2.2. The important result in this section is Theorem 2.1, where it is
proven that the described discrete combination method minimizes
tree-height. A concise and clear example of discrete combination, using
the developed notation, is given at the end of the section. Section
2.3 shows how to minimize monolithic expressions of the form
$E = \sum t_i$ and $E = \prod f_i$, and section 2.4 shows how to minimize
monolithic expressions of the form $E = \prod f_i / f$. Ample examples are
given in each section.

Section 2.5 is quite lengthy, but is logically developed. First
we investigate expressions of the form $E = (\sum t_i) * f$ and $E = (\sum t_i)/f$,
where the terms are not monolithic, and show how to minimize
tree-height through distribution of a monolithic factor $f$ over the sum of
terms. We also prove that we need only consider a monolithic factor
$f$ for distribution. There are two conditions for distribution to
reduce tree-height; the second case is applicable for expressions of
the forms $E = \sum t_i + (\sum t'_j) * f$ or $E = \sum t_i + (\sum t'_j)/f$. We then show how
distribution can be used to minimize non-monolithic expressions of the
forms $E = \sum t_i$ and $E = \prod f_i$. (Note that "minimize" here is used
without considering factoring of common subexpressions; e.g., the
syntactic tree-height of $E = ax^5 + bx^6 + cx^7$ is larger than the
tree-height of $E' = (a + bx + cx^2)x^5$.) Finally, we give an algorithm to
evaluate expressions of the form $E = (\prod f_i)/(\prod f'_j)$ such that
tree-height is reduced. Note that no claim for minimized tree-height is
made in this case; the numerator and denominator may be in so many
forms that it is difficult to claim minimality. If we allow
subtraction to freely replace addition in these expressions, it is clear that
any scalar arithmetic expression may be found among these presented,
except a continued fraction.

A sequence of matrix products, where the matrices are not
necessarily square, may also be parsed such that syntactic tree-height is
minimized. This is the subject of Chapter 3. In section 3.2 we show
how to recognize matrix subsequences which reduce to a scalar; since
scalars commute with matrices, we remove these subsequences to obtain
the canonical form of matrix sequences. In section 3.3, the conditions
for tree-height minimization are presented. We first show that matrix
products are similar to scalar products (except that the commutative
law does not hold) and we prove, in Theorem 3.1, that tree-height
is minimized if and only if slack is minimized. We then show that
when the sequence is a product of three transformations, i.e.,
$E = \prod_{i=1}^{3} T_i$, special consideration must be given to minimize
tree-height as well as the number of computer operations. Finally, we show
how to associate the transformations to ensure that slack is minimized
and to handle the special boundary conditions concerning the
transformations at either end of the matrix sequence. Section 3.4 is the
parsing algorithm which is based on the principles developed in Section
3.3.

In section 4.1 of the scheduling chapter, we lay the ground rules
for the type of oriented, or directed, graph we consider for scheduling.
In section 4.2, Theorem 4.1 establishes a lower bound (LB) on the
number of machines required to process the weighted nodes in the
critical time of the graph (i.e., the longest path between the set of
starting and the set of terminal nodes). We also discuss the number
of computer operations required to determine the LB. In section 4.3
several lemmas establish the optimality of Algorithm a for the
assignment of the weighted nodes from the graph to a system of $k$
parallel, equipotent machines. In section 4.4, we show that the
overall number of computer operations required to make the node
assignments under Algorithm a is $O(n^2)$. Section 4.5 shows how to use
Algorithm a from section 4.3, and the LB from section 4.2, to
establish a least upper bound on the number of machines required
(i.e., sufficiency) to process the graph in the critical time. The last
two sections discuss a non-optimal, but faster scheduling algorithm
and how to use Algorithm a to optimally assign nodes to a system of
nonequipotent, parallel machines, respectively.
2. ARITHMETIC EXPRESSION PARSE TO REDUCE SYNTACTIC TREE-HEIGHT

2.1. Introduction

Recognition  of  parallelism  within  an  arithmetic  expression  having 
distinct  operator  weights  in  order  to  produce  reduced  height  in  the 
syntactic  tree  is  discussed  in  this  chapter.  The  time  that  a  system 
of  parallel  processing  elements  requires  to  calculate  an  arithmetic 
expression  is  governed  by  its  tree-height.  Other  investigators  have 
developed  parsing  algorithms  to  achieve  this  end  by  assuming  that  all 
operations  require  unit  time. 

Baer  and  Bovet  (2)  have  successfully  reduced  tree-height  of  an 
arithmetic  expression  containing  addition  and  multiplication,  by  using 
the  associative  and  commutative  laws.  They  used  a  stack  mechanism  to 
parse  a  +  b  +  c  +  def  +  g  +  h  into 


(((a  +  b)  +  (c  +  g))  +  (d  e)  f)  +  h 


It  is  easily  seen  that  four  steps  are  required  on  a  parallel  machine. 
Muraoka  (28)  added  the  distributive  law  to  the  work  of  Baer  and 
Bovet. Two situations permit distribution: a) when holes in the tree
are present, e.g.,

$A = a(bcd + e)$ is distributed as $A^d = abcd + ae$

[Figure: syntactic trees of $A$ and $A^d$]

where $A$ is executed in four steps while $A^d$ is executed in three steps
on a parallel machine; b) when space in the branches is present, e.g.,

$A = a(bc + d) + e$ is distributed as $A^d = abc + ad + e$

[Figure: syntactic trees of $A$ and $A^d$]

where the height of $A$ is 4, while the height of $A^d$ is 3. Integer sets
identify the existence of free nodes on the tree; set operations are
used to determine if the proper conditions exist to distribute.

When the operators do not have unit weight, then the theories of
Baer-Bovet and Muraoka for expression parsing such that tree-height is
reduced are no longer valid. Observe, for example, when add-weight
$w_a = 2$ and multiply-weight $w_m = 3$, that one way to parse
$E = a + b + c + def + g + h$ such that tree-height is reduced is (cf.
above):

$E' = (((a + b) + (c + g)) + h) + (de)f$

In this case, the minimum tree-height of $E$ is eight whereas the
tree-height of the Baer-Bovet parse with these operator weights is ten.
Clearly, any parsing scheme which reduces tree-height is a function of
the grammar as well as the operator weights. We have already used the
terms "minimize" and "reduce", and will continue to use them throughout
this chapter; the reader should not confuse their meanings.
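
The heights quoted above can be checked mechanically. The following is a minimal sketch, assuming that an atom has height 0 and that a binary node adds its operator weight to the taller of its two subtrees; the function name `height` and the tuple encoding of the trees are illustrative choices, not notation from this paper.

```python
def height(tree, weights):
    """Weighted height of a binary expression tree.

    A leaf is a string (an atom, height 0); an internal node is a
    tuple (op, left, right) whose operator weight is weights[op].
    """
    if isinstance(tree, str):
        return 0
    op, left, right = tree
    return weights[op] + max(height(left, weights), height(right, weights))

sums = ("+", ("+", "a", "b"), ("+", "c", "g"))   # (a + b) + (c + g)
prod = ("*", ("*", "d", "e"), "f")               # (d e) f

baer_bovet = ("+", ("+", sums, prod), "h")       # ((sums) + (d e) f) + h
e_prime = ("+", ("+", sums, "h"), prod)          # ((sums) + h) + (d e) f

unit = {"+": 1, "*": 1}
weighted = {"+": 2, "*": 3}

print(height(baer_bovet, unit))      # 4
print(height(baer_bovet, weighted))  # 10
print(height(e_prime, weighted))     # 8
```

With unit weights both parses have height 4, so the difference between them appears only once the operator weights are taken into account.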


2.2. Notation

Certain mathematical tools are necessary in order to perform the
expression analysis with weighted operators. The notation used here is
evolutionary from that of Muraoka. Let $h[E]$ be the minimum tree-height
of $E$, relative to the reduction algorithm, and let $e[E] = 2^{h[E]}$ be the
effective length of $E$. Also let $h_\alpha[E]$ be the $\alpha$-height of $E$, where
$\alpha = a, m, d$ for add, multiply, or divide respectively, $w_\alpha$ be the
$\alpha$-operator weight, and $e_\alpha[E] = 2^{h_\alpha[E]}$ be the effective
$\alpha$-length of $E$.

Definition 2.1: $h_\alpha[E] = h[E] / w_\alpha$.

That is, no matter what $E$ is, we can find out how many $\alpha$-steps would be
required if $E$ consisted of $\alpha$-combinations of atoms (i.e., simple
variables) alone. For example if $h[E] = 8$ and $w_m = 3$, then $h_m[E] = 8/3$, or
that $2\tfrac{2}{3}$ multiply steps would be required to achieve a tree-height of
8.

Definition 2.2: $w_\alpha$ is an integer.

There is no loss of generality with such an assumption.

Lemma 2.1: $h[E]$ and $e[E]$ are integers.

Since the $\alpha$-operations are discrete, it follows that $h[E]$ and hence
$e[E]$ are integers, which is consistent with Definition 2.2.

Lemma 2.2: $h_\alpha[E]$ is a rational number, and $e_\alpha[E] = (e[E])^{1/w_\alpha}$ is
a real number.

The fact that $e_\alpha[E]$ is real poses some problems. Some of the
subsequent analysis requires an evaluation of $\sum e_\alpha[E_i]$, where the $E_i$ are
subexpressions of $E$. Since these numbers are real, their summation
on a finite word-length machine will lead to errors. Furthermore, it
would be nice to avoid evaluation of fractional exponential functions
as much as possible. The following analysis permits a transformation
of these reals to the integer domain.

Let $(Z,+)$ be the additive group of the integers $Z$. The residue
class $\bar{0} = \{\ldots, -2w_\alpha, -w_\alpha, 0, w_\alpha, 2w_\alpha, \ldots\}$ is also an additive
group. Let $q\bar{0} = \{\ldots, -2w_\alpha + q, -w_\alpha + q, q, w_\alpha + q, 2w_\alpha + q, \ldots\}$
be the coset defined by $q \in Z$. Then if
$X_q = \{x \mid x = \sum 2^{q'/w_\alpha},\ q' \in q\bar{0}\}$
for some $q$, it is easy to show that $(X_q,+)$ is a group which is
isomorphic to $(Z,+)$. The $q$ which are restricted by $0 \le q < w_\alpha$
define all possible cosets (or residue classes) and only these $q$
are used. $(X_q,+)$ is the group generated by the $q$ coset; they are
called generically, the "coset groups."

Note that each $e_\alpha[E] \in (X_q,+)$ for some $q$. To evaluate
$\sum e_\alpha[E_i]$, ordinary addition may be used to combine any two elements
from the same coset-group. That is, if $e_\alpha[E_i] \in (X_q,+)$ and
$e_\alpha[E_j] \in (X_q,+)$ then $e_\alpha[E_i] + e_\alpha[E_j] \in (X_q,+)$. However, if two
elements are in different coset-groups, then some other method of
combining them must be performed.

Definition 2.3: Let $\sum_c$ or $+_c$ mean summation within coset-groups.

Definition 2.4: Let $\lceil \sum_c \rceil$ mean summation between coset-groups, viz:
starting with the smallest coset-group element (i.e., the smallest
among the reals), add it to the next larger element as if they were
equal in size, performing the $+_c$ summation where necessary,
recursively, until a single 2's power number remains.

The syntactic analysis in the following paragraphs develops trees
out of subtrees by discrete combination. Suppose subexpressions $E_1$
and $E_2$ are being combined by some operator $\alpha$. Then
$h[E_1\ \alpha\ E_2] = w_\alpha + \max(h[E_1], h[E_2])$ is a discrete combination of $E_1$
and $E_2$. In terms of effective length,
$e[E_1\ \alpha\ E_2] = 2^{w_\alpha + \max(h[E_1], h[E_2])}$ and
$e_\alpha[E_1\ \alpha\ E_2] = 2 \cdot 2^{\max(h_\alpha[E_1], h_\alpha[E_2])}$.
Definitions 2.3 and 2.4 are used to perform this discrete combination; in
Definition 2.4, a discrete summation is used whenever the terms are of
unequal size. In the following analysis of discrete summation, all
literals are positive integers.

Definition 2.5: The discrete summation of $2^{p_i/r}$ and $2^{p_j/r}$ is
$2 \cdot \max(2^{p_i/r}, 2^{p_j/r})$.

Clearly, when $p_i \ne p_j$ there is a certain amount of waste when
the discrete summation of $2^{p_i/r}$ and $2^{p_j/r}$ is performed. This waste
is called slack.

Definition 2.6: If $p_k > p_j$ we let slack be $s_{j,k} = 2^{p_k/r} - 2^{p_j/r}$.

Hence, the discrete summation may also be expressed in terms of
slack. If $p_i > p_j$, then the discrete summation of $2^{p_i/r}$ and $2^{p_j/r}$
is $2^{p_i/r} + 2^{p_j/r} + s_{j,i} = 2^{p_i/r} + 2^{p_j/r} + (2^{p_i/r} - 2^{p_j/r}) = 2(2^{p_i/r})$.
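
The identity between the discrete summation and the ordinary sum plus slack is easy to confirm numerically. The sketch below uses floating point and hypothetical helper names (`discrete_sum_pair`, `slack`); the values $p_i = 5$, $p_j = 3$, $r = 2$ are chosen only for illustration.

```python
import math

def discrete_sum_pair(p_i, p_j, r):
    """Discrete summation of 2**(p_i/r) and 2**(p_j/r): twice the larger term."""
    return 2 * max(2 ** (p_i / r), 2 ** (p_j / r))

def slack(p_j, p_k, r):
    """Slack s_{j,k} = 2**(p_k/r) - 2**(p_j/r), assuming p_k > p_j."""
    return 2 ** (p_k / r) - 2 ** (p_j / r)

p_i, p_j, r = 5, 3, 2
ordinary_plus_slack = 2 ** (p_i / r) + 2 ** (p_j / r) + slack(p_j, p_i, r)

# Ordinary sum plus slack equals the discrete summation ...
print(math.isclose(ordinary_plus_slack, discrete_sum_pair(p_i, p_j, r)))
# ... which is again a single power of two, with exponent (max(p_i, p_j) + r) / r.
print(math.isclose(discrete_sum_pair(p_i, p_j, r), 2 ** ((max(p_i, p_j) + r) / r)))
```

In exponent terms the discrete summation of two terms simply replaces $p_i$ and $p_j$ by $\max(p_i, p_j) + r$, so it can be carried out on integers alone.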
It is desired to perform the discrete summation of more than two terms
of the type $2^{p_i/r}$. The following lemma follows directly from the
definitions.

Lemma 2.3: The discrete summation of terms such as $2^{p_i/r}$, for
$i = 1,2,\ldots,m$ is

$$\sum_{i=1}^{m} 2^{p_i/r} + \sum_{j=1}^{m-1} s_{j,j+1} = 2^{p/r}$$

for some $p$.

Before attacking the problem of minimizing $2^{p/r}$, let us examine
the following lemmas.

Lemma 2.4: $\sum_{c,\,i=1}^{m} 2^{p_i/r}$ may be expressed as $\sum_{c,\,j=1}^{n} 2^{n_j/r}$ where the $n_j$
are unique.

This follows directly from the definition of $\sum_c$ and the fact
that $(X_q,+)$ is isomorphic to $(Z,+)$ for $q = 0,1,\ldots,r-1$. Since the
$n_j$ are unique, they may be ordered such that $n_1 < n_2 < \cdots < n_n$.


Lemma 2.5: If $n_j < n_k$ and $s_{j,k} = 2^{n_k/r} - 2^{n_j/r}$ then

$$s_{j,k} = \sum_{i=j}^{k-1} s_{i,i+1}.$$

Lemma 2.6: If the conditions of Lemma 2.5 are satisfied, then
$s_{1,2} < s_{2,3} < \cdots < s_{n-1,n}$, and $s_{i,i+1} \le 2^{1/r}\, s_{i+1,i+2}$.

The proof of Lemma 2.6 depends on the fact that $n_i \le n_{i+1} - 1$.

Now  the  following  theorem  may  be  proven. 

Theorem 2.1: No discrete summation of $2^{p_i/r}$, $i = 1,2,\ldots,m$ yields
a smaller $2^{p/r}$ than that found by $\lceil \sum_{c,\,i=1}^{m} 2^{p_i/r} \rceil$.

Proof: By Lemma 2.3, the discrete summation produces a minimum $2^{p/r}$
if and only if $\sum_{i=1}^{m-1} s_{i,i+1}$ is minimized. It can be shown from the
definitions and Lemma 2.4 that

$$\left\lceil \sum_{c,\,i=1}^{m} 2^{p_i/r} \right\rceil = \sum_{i=1}^{n} 2^{n_i/r} + \sum_{i=1}^{n-1} s_{i,i+1}.$$

In other words, the discrete summation produced by the $\lceil \sum_c \rceil$ method
includes $s_{1,2}$ in the slack-sum and then eliminates part or all of
$s_{2,3}$ from the remaining slack; i.e., at least $2s_{1,2}$ is removed
from the remaining slack. By Lemma 2.6, $s_{1,2}$ is the smallest slack
and at each recursion of $\lceil \sum_c \rceil$ the smallest of the remaining
slack is added to the slack-sum. On the other hand, any other
discrete summation not of the two lowest terms can remove at most $s$
from the system, but more than $s_{1,2}$ is added to the slack-sum, as
can be shown by Lemma 2.5.

Q.E.D.


There  is  some  additional  notation  required  for  division;  however, 
it  is  more  convenient  to  include  it  in  section  2.4. 

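An example of the $\lceil\ \rceil$ and $\sum_c$ discrete combination methods will clarify these functions. Let $r = 2$ and $p_1 = 0$, $p_2 = 5$, $p_3 = 2$, $p_4 = 1$, and $p_5 = 1$. The set $\{2^{p_1/r}, 2^{p_3/r}\} \subset (X_0,+)$ and $\{2^{p_2/r}, 2^{p_4/r}, 2^{p_5/r}\} \subset (X_1,+)$. Hence $2^{p_1/r} + 2^{p_3/r} \in (X_0,+)$ and $2^{p_2/r} + 2^{p_4/r} + 2^{p_5/r} \in (X_1,+)$, or $2^{0/2} + 2^{2/2} \in (X_0,+)$ and $2^{5/2} + 2^{1/2} + 2^{1/2} = 2^{5/2} + 2^{3/2} \in (X_1,+)$ respectively. In other words,

\[ \sum_{c,\,i=1}^{5} 2^{p_i/r} = \begin{bmatrix} 2^{0/2} + 2^{2/2} \in (X_0,+) \\ 2^{5/2} + 2^{3/2} \in (X_1,+) \end{bmatrix} . \qquad (2.1) \]

Since the $(X_q,+)$ are isomorphic to $(Z,+)$, we may write (in binary) 11:0 in place of $2^{0/2} + 2^{2/2} \in (X_0,+)$, where the trailing 0 indicates that $(X_0,+)$ is the group, and 10 represents $2^{2/2}$, 01 represents $2^{0/2}$. (Note that $2^{0/2} + 2^{2/2} \to 01 + 10 = 11$.) Similarly write 110:1 in place of $2^{5/2} + 2^{3/2} \in (X_1,+)$. (Also note that $2^{5/2} + 2^{1/2} + 2^{1/2} \to 100 + 001 + 001 = 110$.) Hence,

\[ \sum_{c,\,i=1}^{5} 2^{p_i/r} = \begin{bmatrix} 11{:}0 \\ 110{:}1 \end{bmatrix} \qquad (2.2) \]

is equivalent to equation (2.1).

Evaluation of $\left\lceil \sum_c 2^{p_i/r} \right\rceil$ proceeds in the following way. Using the notation of equation (2.1), $2^{0/2}$ is the smallest coset-group element and $2^{2/2}$ is the next larger. Hence $\lceil 2^{0/2} + 2^{2/2} \rceil = 2^{4/2}$, by definition, and

\[ \left\lceil \sum_{c,\,i=1}^{5} 2^{p_i/r} \right\rceil = \left\lceil \begin{bmatrix} 2^{0/2} + 2^{2/2} \in (X_0,+) \\ 2^{5/2} + 2^{3/2} \in (X_1,+) \end{bmatrix} \right\rceil = \left\lceil \begin{bmatrix} 2^{4/2} \in (X_0,+) \\ 2^{5/2} + 2^{3/2} \in (X_1,+) \end{bmatrix} \right\rceil . \]

In the notation of equation (2.2),

\[ \left\lceil \sum_{c,\,i=1}^{5} 2^{p_i/r} \right\rceil = \left\lceil \begin{bmatrix} 11{:}0 \\ 110{:}1 \end{bmatrix} \right\rceil = \left\lceil \begin{bmatrix} 100{:}0 \\ 110{:}1 \end{bmatrix} \right\rceil . \]

Now $2^{3/2}$ is smallest and $2^{4/2}$ is next larger. Since $\lceil 2^{3/2} + 2^{4/2} \rceil = 2^{6/2}$,

\[ \left\lceil \sum_{c,\,i=1}^{5} 2^{p_i/r} \right\rceil = \left\lceil \begin{bmatrix} 2^{4/2} \in (X_0,+) \\ 2^{5/2} + 2^{3/2} \in (X_1,+) \end{bmatrix} \right\rceil = \left\lceil \begin{bmatrix} 2^{6/2} \in (X_0,+) \\ 2^{5/2} \in (X_1,+) \end{bmatrix} \right\rceil \]

or

\[ \left\lceil \begin{bmatrix} 100{:}0 \\ 110{:}1 \end{bmatrix} \right\rceil = \left\lceil \begin{bmatrix} 1000{:}0 \\ 100{:}1 \end{bmatrix} \right\rceil \]

in the notations of equations (2.1) or (2.2) respectively. Finally,

\[ \left\lceil \sum_{c,\,i=1}^{5} 2^{p_i/r} \right\rceil = \left\lceil \begin{bmatrix} 2^{6/2} \in (X_0,+) \\ 2^{5/2} \in (X_1,+) \end{bmatrix} \right\rceil = \begin{bmatrix} 2^{8/2} \in (X_0,+) \end{bmatrix} = 2^{8/2} \]

or

\[ \left\lceil \begin{bmatrix} 1000{:}0 \\ 100{:}1 \end{bmatrix} \right\rceil = \begin{bmatrix} 10000{:}0 \\ 0{:}1 \end{bmatrix} = 10000{:}0 = 2^{8/2} . \]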
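The evaluation above can be mimicked with a small priority queue. The following is only a sketch under an assumption that is not stated in this form in the text: it works directly on the exponents $p_i$ instead of the coset-group bit lists, and it treats every combination of two terms with exponents $a \le b$ as producing $b + r$ (as in $\lceil 2^{0/2} + 2^{2/2} \rceil = 2^{4/2}$). The function name `ceiling_sum_exponent` is hypothetical.

```python
import heapq

def ceiling_sum_exponent(exponents, r):
    """Return p such that the ceiling-sum of the terms 2**(p_i/r) is 2**(p/r).

    Assumed rule: the two smallest remaining terms are combined, and the
    result lies r exponent units above the larger of the two.
    Requires a non-empty list of exponents.
    """
    heap = list(exponents)
    heapq.heapify(heap)
    while len(heap) > 1:
        heapq.heappop(heap)              # smallest term is absorbed
        larger = heapq.heappop(heap)     # next larger term
        heapq.heappush(heap, larger + r)
    return heap[0]

# The example of equations (2.1) and (2.2): r = 2, p = 0, 5, 2, 1, 1.
print(ceiling_sum_exponent([0, 5, 2, 1, 1], r=2))  # 8
```

For this input the sketch combines the terms in a different order from the coset-group lists (it pairs exponents 0 and 1 first), yet it arrives at the same final exponent 8.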


2.3.   Tree-height  Minimization  of  Monolithic  Sums  and  Products 

In  this  section,  we  consider  arithmetic  expressions  which  are 
either  a  sum  of  terms  or  a  product  of  factors.   The  productions 
which  specify  a  grammar  of  these  expressions  are  the  following: 

E → F | T ,

F → a | (T) | F * F ,

T → F | T + T .


Production F represents a product of factors, production T represents a sum of terms, and terminal a is an atom from the set of distinct variable names. Furthermore, it may be written for all factors $f_i$ and terms $t_j$

\[ f_i = \sum_{j=1}^{m_i} t_j , \quad \text{and} \quad t_j = \prod_{i=1}^{n_j} f_i . \]

The $f_i$ are called monolithic if factoring among the $t_j$ is not considered, and the $t_j$ are called monolithic if distribution among the $f_i$ is not considered. In other words, the internal characteristics of a monolithic $f_i$ or $t_j$ are not considered.

The  theorems  below  follow  directly  from  the  definitions  and 
Theorem  2.1. 


Theorem 2.2: If $E = \sum_{i=1}^{n} t_i$ and each $t_i$ is monolithic, then

\[ h[E] = \log_2 \left\lceil \sum_{c,\,i=1}^{n} e_a[t_i] \right\rceil^{w_a} \]

and $h[E]$ is minimized.

Theorem 2.3: If $E = \prod_{i=1}^{m} f_i$ and each $f_i$ is monolithic, then

\[ h[E] = \log_2 \left\lceil \sum_{c,\,i=1}^{m} e_m[f_i] \right\rceil^{w_m} \]

and $h[E]$ is minimized.

Let us demonstrate the use of these theorems to evaluate the minimum tree-height, and at the same time the syntactic tree, of the expression $E = a + b + c + def + g + h$. Let $w_a = 2$ and $w_m = 3$. Then $E = \sum_{i=1}^{6} t_i$, where $t_1 = a$, $t_2 = b$, $t_3 = c$, $t_4 = def$, $t_5 = g$ and $t_6 = h$. All of these terms are simple except $t_4$, which is a product of factors, i.e., $t_4 = \prod_{i=1}^{3} f_i$, where $f_1 = d$, $f_2 = e$, $f_3 = f$. Suppose that a memory fetch of a variable, $v$, requires zero time, i.e., $h[v] = 0$; then $e[v] = 1$.

For each subexpression $E_i$, let us represent $e_r[E_i]$ as a binary integer in $(X_q,+)$ in the following way. Since $e_r[E_i] = 2^{p_i/r}$, include $2^{n_i}$, where $n_i = \lfloor p_i / r \rfloor$ (i.e., integer divide), in the $q$ coset-group, where $q = p_i \pmod r$. Then the analysis of $t_4$ yields

\[ \left\lceil \sum_{c,\,i=1}^{3} e_m[f_i] \right\rceil = \left\lceil \begin{bmatrix} 1{+}1{+}1{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix} \right\rceil = \left\lceil \begin{bmatrix} 11{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix} \right\rceil \]

for the 3 coset-groups, or residue classes, 0, 1, and 2 respectively. ($e_m[d] = e_m[e] = e_m[f] = 1$ in coset-group 0.) The 11:0 represents $2^{3/3} + 2^{0/3}$ in coset-group $(X_0,+)$. Thus,

\[ e[t_4] = \left\lceil \sum_{c,\,i=1}^{3} e_m[f_i] \right\rceil^{3} = \left\lceil \begin{bmatrix} 11{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix} \right\rceil^{3} = \begin{bmatrix} 100{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix}^{3} = (2^{6/3})^3 = 2^6 \]

and $e_a[t_4] = e[t_4]^{1/2} = 2^{6/2} = 1000{:}0$ in $(X_0,+)$, where $w_a = 2$. Then,

\[ \sum_{c,\,i=1}^{6} e_a[t_i] = \begin{bmatrix} 1{+}1{+}1{+}1000{+}1{+}1{:}0 \\ 0{:}1 \end{bmatrix} = \begin{bmatrix} 1101{:}0 \\ 0{:}1 \end{bmatrix} \]

for the 2 coset-groups 0 and 1. Thus,

\[ e[E] = \left\lceil \sum_{c,\,i=1}^{6} e_a[t_i] \right\rceil^{2} = \left\lceil \begin{bmatrix} 1101{:}0 \\ 0{:}1 \end{bmatrix} \right\rceil^{2} = \begin{bmatrix} 10000{:}0 \\ 0{:}1 \end{bmatrix}^{2} = (2^{8/2})^2 = 2^8 \]

and $h[E] = \log_2 e[E] = 8$.

Note that, in addition to finding $h[E]$, the order in which the lists are combined during the $\sum_c$ and $\lceil\ \rceil$ operations indicates the exact way that the terms in $E$ are combined to achieve $h[E]$. For an additional example see Figure 2.1.
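The two theorems can be exercised on whole expressions with a short recursive evaluator. This is a sketch under assumptions of its own: expressions are written as nested tuples (a representation chosen here for illustration, not taken from the text), every variable has height 0, and the operands of each sum or product are treated as monolithic and combined two-smallest-first, with the result placed one operator weight above the larger operand. The names `combine`, `height`, `W_ADD` and `W_MUL` are hypothetical.

```python
import heapq

W_ADD, W_MUL = 2, 3  # w_a and w_m as in the example above

def combine(heights, weight):
    """Height reached when operands of the given heights are combined pairwise,
    two smallest first, each operation costing `weight`."""
    heap = list(heights)
    heapq.heapify(heap)
    while len(heap) > 1:
        heapq.heappop(heap)                       # smaller operand
        larger = heapq.heappop(heap)              # larger operand
        heapq.heappush(heap, larger + weight)
    return heap[0]

def height(expr):
    """expr is a variable name (str) or a tuple (op, operand, operand, ...)
    with op equal to '+' or '*'."""
    if isinstance(expr, str):
        return 0                                  # h[v] = 0 for a variable
    op, *operands = expr
    weight = W_ADD if op == "+" else W_MUL
    return combine([height(e) for e in operands], weight)

# E = a + b + c + def + g + h, and the expression of Figure 2.1.
E = ("+", "a", "b", "c", ("*", "d", "e", "f"), "g", "h")
FIG_2_1 = ("+", "a", "b", ("*", "c", "d"), ("*", "e", ("+", "f", "g")))
print(height(E), height(FIG_2_1))  # 8 7
```

The evaluator never reorders across operator boundaries, so it reports the height of the expression as parenthesized; it does not attempt distribution.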

The subtraction operator has the same precedence as addition in our grammar for arithmetic expressions. If it is assumed that the operator weights are the same, i.e., $w_s = w_a$, then Theorem 2.2 may be applied to expressions which have a mixture of terms combined by addition and subtraction. If all the operators in the expression are subtraction, i.e., the expression may be written as $E = -\sum^{n} t_i$, then the unary minus may be eliminated by writing $E' = 0 - \sum^{n} t_i$. Depending upon the nature of the expression $E$, the tree-height of $E'$ may or may not be greater, by $w_a$, than that of $E$.


2.4.  Tree-height  Minimization  of  Monolithic  Expressions  with  Division 

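The division operator has the same precedence as multiplication; however, on most computers $w_d \ge w_m$. If it is desired to parse an expression of the type $E = \prod f_i / f_d$ such that tree-height is minimized, it is necessary to select the right factor, or product of factors, in the numerator for combination with the divisor in a non-obvious way.

Figure 2.1: Syntactic Tree of E = a + b + cd + e * (f + g) with Coset-group Analysis.

The coset-group analysis shown with the figure is:

\[ t_3 = cd: \quad e[t_3] = \left\lceil \begin{bmatrix} 1{+}1{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix} \right\rceil^{3} = \begin{bmatrix} 10{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix}^{3} = (2^{3/3})^3 = 2^3 \]

\[ e[f+g] = \left\lceil \begin{bmatrix} 1{+}1{:}0 \\ 0{:}1 \end{bmatrix} \right\rceil^{2} = \begin{bmatrix} 10{:}0 \\ 0{:}1 \end{bmatrix}^{2} = (2^{2/2})^2 = 2^2 \]

\[ t_4 = e(f+g): \quad e[t_4] = \left\lceil \begin{bmatrix} 1{:}0 \\ 0{:}1 \\ 1{:}2 \end{bmatrix} \right\rceil^{3} = \begin{bmatrix} 0{:}0 \\ 0{:}1 \\ 10{:}2 \end{bmatrix}^{3} = (2^{5/3})^3 = 2^5 \]

\[ E = \sum_{i=1}^{4} t_i: \quad e[E] = \left\lceil \begin{bmatrix} 1{+}1{:}0 \\ 10{+}100{:}1 \end{bmatrix} \right\rceil^{2} = \left\lceil \begin{bmatrix} 10{:}0 \\ 110{:}1 \end{bmatrix} \right\rceil^{2} = \begin{bmatrix} 0{:}0 \\ 1000{:}1 \end{bmatrix}^{2} = (2^{7/2})^2 = 2^7 \]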


Definition 2.6: If $E = \prod_{i=1}^{p} f_i / f_d$, then

\[ e_m[E] = \left\lceil \sum_{c,\,i=1}^{p} e_m[f_i] +_d e_m[f_d] \right\rceil . \]

The $\lceil \dots +_d \dots \rceil$ symbols are interpreted as follows. Recall that $\sum_{c,\,i=1}^{p} e_m[f_i] = \sum_{c,\,i=1}^{n} 2^{n_i/w_m}$ where $n_i \le n_{i+1}$. Let us simplify notation by setting $q_i = n_i / w_m$ and $w'_d = w_d / w_m$. Then,

1) If $q_d \le q_1$, then $\lceil 2^{q_1} +_d 2^{q_d} \rceil = 2^{q_1 + w'_d} = 2^{q'_1}$, and

\[ e_m[E] = \left\lceil \sum_{c,\,i=2}^{n} 2^{q_i} +_c 2^{q'_1} \right\rceil . \]

2) If $q_1 \le q_d < q_2$, then $\lceil 2^{q_1} +_d 2^{q_d} \rceil = 2^{q_d + w'_d} = 2^{q'_1}$, and

\[ e_m[E] = \left\lceil \sum_{c,\,i=2}^{n} 2^{q_i} +_c 2^{q'_1} \right\rceil . \]

(If $q_2$ does not exist, assume that $q_d < q_2$.)

3) If $q_2 \le q_d < q_3$ and $q_d < q_2 + 1$, then if $q_3 < q_d + w'_d$ then $\lceil 2^{q_2} +_d 2^{q_d} \rceil = 2^{q_d + w'_d} = 2^{q'_2}$, and

\[ e_m[E] = \left\lceil \sum_{c,\,i=3}^{n} 2^{q_i} +_c 2^{q_1} +_c 2^{q'_2} \right\rceil . \]

(If $q_3$ does not exist, assume that $q_d < q_3$.)

4) Otherwise, combine the two smallest terms under the $\lceil\ \rceil$ operation (i.e., eliminate $2^{q_1}$ and replace $2^{q_2}$ by $2^{q_2 + 1}$, reordering terms if necessary) and go to step 1.

Theorem 2.4: If $E = \prod_{i=1}^{n} f_i / f_d$, $w_d \ge w_m$, and the $f_i$, $f_d$ are monolithic, then

\[ h[E] = \log_2 \left\lceil \sum_{c,\,i=1}^{n} e_m[f_i] +_d e_m[f_d] \right\rceil^{w_m} \]

and $h[E]$ is minimized.

Proof: It is only necessary to show that the algorithm in Definition 2.6 produces minimal $e_m[E]$.

Case 1) $q_d \le q_1$: Slack-sum is minimized when $f_1 / f_d$ is first, since any other order of combination includes some slack more than once in the slack-sum (cf. Theorem 2.1).

Case 2) $q_1 \le q_d < q_2$: It is also obvious that $f_1 / f_d$ minimizes the slack-sum.

Case 3) $q_2 \le q_d < q_3$:

a) If $q_2 + 1 \le q_d$ then combining $(f_1 * f_2)$ yields Case 2, which is minimal.

b) $q_d < q_2 + 1$: If $q_3 \ge q_d + 1 + w'_d$ then no matter how $f_1$, $f_2$, $f_3$ and $f_d$ are combined, $h_m[f_1 * f_2 * f_3 / f_d] = q_3 + 1$. If $q_d + 1 + w'_d \ge q_3 \ge q_d + w'_d$ then it is best to combine $(f_1 * f_2)$ first since

\[ h_m[(f_1 * (f_2 / f_d)) * f_3] = q_d + w'_d + 2 \ge q_2 + w'_d + 2 = h_m[((f_1 * f_2) / f_d) * f_3] . \]

If $q_d + w'_d > q_3$ then

\[ h_m[((f_1 * f_2) / f_d) * f_3] = q_2 + w'_d + 2 > h_m[f_1 * (f_2 / f_d) * f_3] = \begin{cases} q_3 + 2 & \text{if } q_3 > q_d + w'_d - 1 \\ q_d + w'_d + 1 & \text{if } q_3 \le q_d + w'_d - 1 . \end{cases} \]

Q.E.D.
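For small instances, a parse obtained from Definition 2.6 can be compared against an exhaustive search. The sketch below is not the algorithm of Definition 2.6; it simply tries every way of grouping the numerator factors, under the assumptions that a multiplication of operands with heights $a$ and $b$ has height $\max(a, b) + w_m$, that a division has height $\max(a, b) + w_d$, and that the divisor is applied exactly once, to some sub-product of the numerator. The function name `min_quotient_height` and its parameters are hypothetical.

```python
from functools import lru_cache

def min_quotient_height(num_heights, div_height, w_m, w_d):
    """Exhaustively minimise the height of (f_1 * ... * f_n) / f_d.

    num_heights: heights of the numerator factors (non-empty sequence)
    div_height:  height of the divisor f_d
    """
    n = len(num_heights)

    def splits(mask):
        # every way to split `mask` into two non-empty parts
        sub = (mask - 1) & mask
        while sub:
            yield sub, mask ^ sub
            sub = (sub - 1) & mask

    @lru_cache(maxsize=None)
    def product(mask):
        if mask & (mask - 1) == 0:                  # a single factor
            return num_heights[mask.bit_length() - 1]
        return min(max(product(a), product(b)) + w_m for a, b in splits(mask))

    @lru_cache(maxsize=None)
    def quotient(mask):
        # divide the whole product of `mask` by f_d ...
        best = max(product(mask), div_height) + w_d
        # ... or divide only part of it and multiply the rest in afterwards
        for a, b in splits(mask):
            best = min(best, max(quotient(a), product(b)) + w_m)
        return best

    return quotient((1 << n) - 1)

# Two factors of heights 0 and 10 over a divisor of height 0:
# dividing the small factor first, (f_1 / f_d) * f_2, gives the minimum.
print(min_quotient_height([0, 10], 0, w_m=3, w_d=5))  # 13
```

The search visits every subset of the numerator, so it is only practical for a handful of factors; it is meant as a cross-check, not as a parsing method.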


Figure 2.2 demonstrates the use of Definition 2.6 for an expression $E = \prod_{i=1}^{6} f_i / f_d$ for two cases: a) where $h[f_1] = 0$, $h[f_2] = h[f_3] = 1$, $h[f_4] = h[f_5] = 2$, $h[f_6] = 15$, and $h[f_d] = 10$; and b) where $h[f_6] = 17$ instead of 15 and otherwise the same as case a). These two syntactic trees of $E$ are in fact of minimal height, as the reader may wish to verify.


2.5.   Distribution

In this section we consider expressions which are of the types $(\sum t_i) * f$, $(\sum t_i) / f$, and $(\sum t_i) * (\sum t'_j)$. Lemmas are given which show how to determine when tree-height is reduced by distribution of a factor over a sum of terms. These lead to algorithms which reduce tree-height for expressions of the types $\prod f_i$, $\sum t_i$, and $(\prod f_i) / (\prod f'_j)$ when the factors and terms are not monolithic, i.e., each $f_i = \sum_j t_{i,j}$.

Consider $E = (a + bcd)(e + f)$ and the partially distributed form $E^d = a(e + f) + bcd(e + f)$. Then syntactic trees for $E$ and $E^d$ with $w_a = 2$ and $w_m = 3$ are:

[Syntactic trees of E = (a + b * c * d) * (e + f) and E^d = a * (e + f) + b * c * d * (e + f); tree drawings not recoverable.]

[Figure 2.2: syntactic trees of a) E = ((f_1 * (f_2 * f_3)) / f_d) * ((f_4 * f_5) * f_6) and b) E = (((f_1 * (f_2 * f_3)) * (f_4 * f_5)) / f_d) * f_6; tree drawings not recoverable.]

In this case, $E = f_1 * f_2$, and $h[E] = 11$, whereas $E^d = t_1 + t_2$ and $h[E^d] = 10$. The difference in tree-height is due to distribution of $f_2$ across $f_1 = t_{1,1} + t_{1,2}$. The o's in the syntactic trees above indicate the presence of holes* where larger subtrees may be accommodated. Tree-holes are in one-to-one correspondence with the slack found during the use of the $\lceil \sum_c \rceil$ algorithm.


Lemma 2.7: Let $E = \prod_{i=1}^{p} f_i$, $E_1 = f * (E)$ and $E_2 = \prod_{i=1}^{p+1} f_i$ where $f_{p+1} = f$ and all factors be monolithic. If

\[ \left\lceil \sum_{c,\,i=1}^{p+1} e_m[f_i] \right\rceil < 2 * \left\lceil \sum_{c,\,i=1}^{p} e_m[f_i] \right\rceil \]

then there exists a hole in $E$ to accommodate $f$, and $h[E_2] < h[E_1]$.

For example, suppose $E = (a + b)cd$, $E_1 = ((a + b)cd)e$ and $E_2 = (a + b)cde$. Since $E$ has a hole which can accommodate $e$, then $h[E_1] > h[E_2]$, viz:

[Syntactic trees of E_1 = ((a + b) * c * d) * e and E_2 = (a + b) * c * d * e; tree drawings not recoverable.]

where $w_a = 2$ and $w_m = 3$.

* cf. Muraoka (28), chapter 3
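The effect described by Lemma 2.7 can be checked numerically. The sketch below does not evaluate the effective-length inequality of the lemma; it compares the two tree-heights in the conclusion directly, assuming that a product of monolithic factors is parsed two-smallest-first with each multiplication placed $w_m$ above its larger operand. The names `product_height` and `has_hole` are hypothetical.

```python
import heapq

def product_height(factor_heights, w_m):
    """Height of a product of monolithic factors, combined two smallest first."""
    heap = list(factor_heights)
    heapq.heapify(heap)
    while len(heap) > 1:
        heapq.heappop(heap)                              # smaller operand
        heapq.heappush(heap, heapq.heappop(heap) + w_m)  # larger operand + w_m
    return heap[0]

def has_hole(factor_heights, f_height, w_m):
    """True if merging f into the product beats multiplying by f afterwards."""
    h_e1 = max(product_height(factor_heights, w_m), f_height) + w_m   # f * (E)
    h_e2 = product_height(list(factor_heights) + [f_height], w_m)     # f merged
    return h_e2 < h_e1

# E = (a + b)cd with w_a = 2, w_m = 3: factor heights 2, 0, 0; f = e has height 0.
print(has_hole([2, 0, 0], 0, w_m=3))  # True
# E = ab is perfectly balanced, so there is no room for a further factor.
print(has_hole([0, 0], 0, w_m=3))     # False
```

For the first call the heights are 9 for $((a + b)cd)e$ and 8 for $(a + b)cde$, matching the example above.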

Now consider any expression $E$ which consists of arithmetic combinations of variables $v_1, v_2, \dots, v_n$. In the syntactic tree of $E$, there is at least one path from the root at level $h[E]$ to level zero which contains no slack. Furthermore, if $E$ consists of a set of monolithic subexpressions $E_1, E_2, \dots, E_n$, there is a path from level $h[E]$ to some $E_i$ at level $h[E_i]$ which contains no slack.

Definition 2.7: Subexpression $E_i$ of $E$ is said to be dominant in $E$ if there is no slack in the syntactic tree on a path from the root of the tree of $E$ to the root of the tree of $E_i$.

Suppose $E = \sum^{p} t_i$ or $E = \prod^{p} f_i$, where the terms or factors, respectively, are monolithic. Previous analysis has shown that $\sum_{c,\,i=1}^{p} e_i = \sum_{c,\,i=1}^{n} 2^{n_i/w_a}$, where $e_i$ represents $e_a[t_i]$ or $e_m[f_i]$ respectively. Let $e_a[E_i] = 2^{n_i/w_a}$; i.e., subexpression $E_i$ corresponds to the sum of terms, or product of factors, whose effective length is $2^{n_i/w_a}$.

Lemma 2.8: Let $E = \sum E_i$ or $E = \prod E_i$, where the $E_i$ are monolithic and defined as above. Subexpression $E_j$ is dominant in $E$ if

\[ n_j = \max(\{ n_k \mid 2^{n_k/w_a} \in (X_q,+) \}) \quad \text{such that} \quad \left\lceil \sum_c e[E_i] \right\rceil \in (X_q,+) . \]

Furthermore, if there exists an $E_k$ such that $n_k = n_j - w_a$, then $E_k$ is also dominant in $E$. Call these the major-dominant subexpressions of $E$.

For example, consider $E = \sum^{3} E_i$ and $E' = \prod^{3} E_i$, and let $h[E_1] = 2$, $h[E_2] = 3$, $h[E_3] = 5$, $w_a = 2$ and $w_m = 3$. Then the syntactic trees of $E$ and $E'$ are:

[Syntactic trees of E = E_1 + E_2 + E_3 and E' = E_1 * E_2 * E_3; tree drawings not recoverable.]

Note that $E_2$ and $E_3$ are major-dominant in $E$ and only $E_2$ is dominant in $E'$. The coset-group analysis is:

\[ e_a[E] = \left\lceil \sum_c e_a[E_i] \right\rceil = \left\lceil \begin{bmatrix} 10{:}0 \\ 110{:}1 \end{bmatrix} \right\rceil = \begin{bmatrix} 0{:}0 \\ 1000{:}1 \end{bmatrix} = 2^{7/2} \]

and

\[ e_m[E'] = \left\lceil \sum_c e_m[E_i] \right\rceil = \left\lceil \begin{bmatrix} 10{:}0 \\ 0{:}1 \\ 11{:}2 \end{bmatrix} \right\rceil = \left\lceil \begin{bmatrix} 100{:}0 \\ 0{:}1 \\ 10{:}2 \end{bmatrix} \right\rceil = \begin{bmatrix} 1000{:}0 \\ 0{:}1 \\ 0{:}2 \end{bmatrix} = 2^{9/3} \]

respectively. Since $e_a[E] \in X_1$, then 110:1 indicates that $E_2$ and $E_3$ are the major-dominant subexpressions; and since $e_m[E'] \in X_0$, then 10:0 indicates $E_2$ is the major-dominant subexpression.
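Definition 2.7 can also be applied operationally: build the tree and record which subexpressions are reached from the root through edges without slack. The sketch below assumes the two-smallest-first parse used in the examples, and it calls an edge slack-free when the child lies exactly one operator weight below its parent. It does not use the coset-group test of Lemma 2.8, and when ties allow several minimal trees it reports the dominant set of the one tree it happens to build. The name `dominant` is hypothetical; indices are 0-based, so $E_2$ appears as index 1.

```python
import heapq
from itertools import count

def dominant(heights, weight):
    """Indices of the subexpressions joined to the root by slack-free edges."""
    tie = count()  # unique tie-breaker so the heap never compares the sets
    heap = [(h, next(tie), frozenset([i])) for i, h in enumerate(heights)]
    heapq.heapify(heap)
    while len(heap) > 1:
        h_small, _, tight_small = heapq.heappop(heap)
        h_large, _, tight_large = heapq.heappop(heap)
        # The larger operand always sits directly under the new node;
        # the smaller one does so only when the two heights are equal.
        if h_small == h_large:
            tight = tight_large | tight_small
        else:
            tight = tight_large
        heapq.heappush(heap, (h_large + weight, next(tie), tight))
    return set(heap[0][2])

# h[E_1] = 2, h[E_2] = 3, h[E_3] = 5
print(sorted(dominant([2, 3, 5], weight=2)))  # [1, 2]  (E_2 and E_3 in the sum)
print(sorted(dominant([2, 3, 5], weight=3)))  # [1]     (only E_2 in the product)
```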

Lemma 2.9: If $E = \sum_{i=1}^{p_1} t_i + \left( \sum_{i=1}^{p_2} t'_i \right) * f$, then if $\left( \sum_{i=1}^{p_2} t'_i \right) * f$ is not dominant in $E$, then distribution of $f$ across $\sum_{i=1}^{p_2} t'_i$ does not reduce tree-height.


Now the tools have been developed to determine when distribution of an expression such as $E = \left( \sum_{i=1}^{p_1} t_i \right) * f$ reduces tree-height when $w_m \ge w_a$. Let $J = \{ i \mid t_i \in \{E_j\}$ or $t_i \in \{E_j, E_k\}$, i.e., the set of major-dominant subexpressions of $E \}$. If each $t_i = \prod_{j=1}^{p_i} f_{i,j}$, $i \in J$, has a hole to accommodate the factor $f$, then a substitution of $E^d$ for $E$ is performed where

\[ E^d = \sum_{i \in J} t_i * f + \Big( \sum_{i \notin J} t_i \Big) * f . \qquad (2.3) \]

The second term of $E^d$ is also in the form of $E$, and if it is dominant in $E^d$, then it is desirable to perform distribution on this term also, by applying the distribution algorithm recursively, terminating when some term is found without a hole. The general form of $E^d$ is

\[ E^d = \sum_{i \in J_1} t_i * f + \sum_{i \in J_2} t_i * f + \dots + \sum_{i \in J_k} t_i * f + \Big( \sum_{i \notin J_1 \cup \dots \cup J_k} t_i \Big) * f . \qquad (2.4) \]

The tree-height of $E^d$ is

\[ h[E^d] = \log_2 \left\lceil \sum_{c,\,i \in J_1} \left\lceil \sum_{c,\,j} e_m[f_{i,j}] +_c e_m[f] \right\rceil^{w_m/w_a} +_c \dots +_c \sum_{c,\,i \in J_k} \left\lceil \sum_{c,\,j} e_m[f_{i,j}] +_c e_m[f] \right\rceil^{w_m/w_a} +_c \left\lceil \left\lceil \sum_{c,\,i \notin J_1 \cup \dots \cup J_k} e_a[t_i] \right\rceil^{w_a/w_m} +_c e_m[f] \right\rceil^{w_m/w_a} \right\rceil^{w_a} . \]

If $E = (a + bcd)(e + f)$ (cf. the example at the beginning of this section), the term $bcd$ is dominant in $(a + bcd)$ and also has a hole to accommodate the factor $(e + f)$. Thus, $E^d = bcd(e + f) + (a)(e + f)$. The term $(a)(e + f)$ is not dominant in $E^d$ (and $(a)$ has no holes), so no additional distribution will further reduce tree-height.
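The choice of $J$ in equation (2.3) can be cross-checked on small inputs by trying every subset of terms. This is a brute-force sketch, not the dominance-and-hole procedure of the text. It assumes that each term is a product of monolithic factors, that sums and products are parsed two-smallest-first, and that the undistributed remainder $(\sum_{i \notin J} t_i) * f$ is evaluated as parenthesized. The names `combine` and `best_distribution` are hypothetical.

```python
import heapq
from itertools import combinations

def combine(heights, weight):
    """Height of operands combined two smallest first, `weight` per operation."""
    heap = list(heights)
    heapq.heapify(heap)
    while len(heap) > 1:
        heapq.heappop(heap)
        heapq.heappush(heap, heapq.heappop(heap) + weight)
    return heap[0]

def best_distribution(terms, f_height, w_a, w_m):
    """terms: one list of factor heights per term t_i of (sum t_i) * f.

    Returns (height, J) for the first index tuple J that minimises the height
    of sum_{i in J} t_i * f + (sum_{i not in J} t_i) * f.
    """
    indices = range(len(terms))
    best = None
    for size in range(len(terms) + 1):
        for J in combinations(indices, size):
            # distributed terms: f becomes one more factor of t_i
            parts = [combine(terms[i] + [f_height], w_m) for i in J]
            rest = [combine(terms[i], w_m) for i in indices if i not in J]
            if rest:
                # remainder kept as a parenthesized sum multiplied by f
                parts.append(combine([combine(rest, w_a), f_height], w_m))
            h = combine(parts, w_a)
            if best is None or h < best[0]:
                best = (h, J)
    return best

# E = (a + bcd)(e + f), w_a = 2, w_m = 3: terms a and bcd, h[e + f] = 2.
print(best_distribution([[0], [0, 0, 0]], f_height=2, w_a=2, w_m=3))  # (10, (1,))
```

With $J$ empty the same search evaluates the undistributed $E$ at height 11, so the reported pair corresponds to distributing over $bcd$ only, as in the paragraph above.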

Sometimes, however, even though $(\sum t'_i) * f$ is dominant in $E = \sum t_i + (\sum t'_i) * f$ and there is some dominant term $t'_i$ in $\sum t'_i$ which has no hole to accommodate $f$, distribution of $f$ across $\sum t'_i$ reduces tree-height anyway. For example, let $E = a + bc + (de + f)g$. The term $(de + f)g$ is dominant in $E$ and $de$ is dominant in $(de + f)$, but has no hole to accommodate $g$. Even so, $E^d = a + bc + deg + fg$ has a lower tree-height than $E$, viz:

[Syntactic trees of E = a + b * c + (d * e + f) * g and E^d = a + b * c + f * g + d * e * g; tree drawings not recoverable.]

where $w_a = 2$ and $w_m = 3$. On the other hand, if $E = a + bc + (de + fg)h$ then this phenomenon does not occur when $h$ is distributed across $(de + fg)$, and the tree-height is 10 in either case. Furthermore, if $E = ab + cd + (ef + g)h$ then $h[E] > h[E^d]$, where $E^d = ab + cd + efh + gh$, viz:

[Syntactic trees of E = a * b + c * d + (e * f + g) * h and E^d = a * b + c * d + g * h + e * f * h; tree drawings not recoverable.]
The reason for distributing in this case is quite different from hole distribution. The dominant term in $E$ is distributed in order to split it apart into several smaller terms so that the resultant syntactic tree may become less unbalanced. Let $E = \sum_{i=1}^{p_1} t_i + E' * f$, where $E' * f$ is dominant in $E$ and $E' = \sum_{i=1}^{p_2} t'_i = \sum_{i=1}^{n} E'_i$, i.e., a sum of subexpressions $E'_i$ which are defined by the bits in the coset-groups in correspondence with $\sum_{c,\,i=1}^{p_2} e_a[t'_i] = \sum_{c,\,i=1}^{n} e_a[E'_i]$. Let $J$ be the set of major-dominant subexpressions in $E'$ such that some term $t'_i \in J$ does not have a hole to accommodate $f$, and hence $\bar{J}$ is the set of remaining subexpressions of $E'$ not in $J$.

Lemma 2.10: Given the above conditions for $E$, let

\[ E^d = \sum_{i=1}^{p_1} t_i + \sum_{i \in J} E'_i * f + \Big( \sum_{i \in \bar{J}} E'_i \Big) * f . \]

If

\[ e_a[E^d] = \left\lceil \sum_{c,\,i=1}^{p_1} e_a[t_i] +_c \sum_{c,\,i \in J} \left\lceil e_m[E'_i] +_c e_m[f] \right\rceil^{w_m/w_a} +_c \left\lceil \left\lceil \sum_{c,\,i \in \bar{J}} e_a[E'_i] \right\rceil^{w_a/w_m} +_c e_m[f] \right\rceil^{w_m/w_a} \right\rceil \]

\[ < \left\lceil \sum_{c,\,i=1}^{p_1} e_a[t_i] +_c \left\lceil \left\lceil \sum_{c,\,i=1}^{p_2} e_a[t'_i] \right\rceil^{w_a/w_m} +_c e_m[f] \right\rceil^{w_m/w_a} \right\rceil = e_a[E] , \]

then $h[E^d] < h[E]$.

This lemma is obviously true. Basically, a check is made to see if the effective add-length of $E^d$ is less than the effective add-length of $E$. If it is, then $h[E^d] < h[E]$.
Since equation (2.4) is of the form $E = \sum_{i=1}^{p_1} t_i + \left( \sum_{i=1}^{p_2} t'_i \right) * f$, the final step in the distribution algorithm is to check for a reduction in effective length by decomposition in the following way: If $\left( \sum_{i=1}^{p_2} t'_i \right) * f$ is dominant in $E$, then if $e_a[E^d] < e_a[E]$, where

\[ E^d = \sum_{i=1}^{p_1} t_i + \sum_{i \in J} E'_i * f + \Big( \sum_{i \in \bar{J}} E'_i \Big) * f \qquad (2.5) \]

then equation (2.5) is the desired form of $E^d$. Otherwise equation (2.4) is the desired form of $E^d$.

By application of Lemmas 2.7, 2.9, and 2.10, one can easily show that any further distribution of a monolithic factor $f$ across a summation of terms, which are products of monolithic factors, does not reduce tree-height, and thus $h[E^d]$ is minimized with respect to monolithic factors. In summary, the Multiplication Distribution Algorithm is the recursive application of Lemmas 2.7 and 2.9, followed by the application of Lemma 2.10.

As an example of the above distribution algorithm, consider $E = (a + b + (cde + f + g)h)i$ where $h[E] = 16$ when $w_a = 2$ and $w_m = 3$, viz:

[Syntactic tree of E = (a + b + (c * d * e + f + g) * h) * i = (t_1 + t_2 + t_3) * f_1; tree drawing not recoverable.]

Since $t_3$ is dominant in $t_1 + t_2 + t_3$, and $cde$ is dominant in $t_3$, which contains a hole to accommodate $h$, $E^{d_1}$ is found, viz:

[Syntactic tree of E^{d_1} = (a + b + (f + g) * h + c * d * e * h) * i = (t_1 + t_2 + t_3 + t_4) * f_1; tree drawing not recoverable.]

Now $t_3$ is dominant in $t_1 + t_2 + t_3 + t_4$ and $t_3$ contains a hole to accommodate $f_1$. Hence $E^{d_2}$ is found, viz:

[Syntactic tree of E^{d_2} = (f + g) * h * i + (a + b + c * d * e * h) * i = t_1 + (t'_1 + t'_2 + t'_3) * f_1; tree drawing not recoverable.]

Now $(t'_1 + t'_2 + t'_3) * f_1$ is dominant in $E^{d_2}$, and $t'_3$ is dominant in $t'_1 + t'_2 + t'_3$, but $t'_3$ contains no hole to accommodate $f_1$. Nevertheless, $e_a[E^{d_3}] < e_a[E^{d_2}]$ and $E^{d_3}$ is parsed in the following way:

[Syntactic tree of E^{d_3} = (f + g) * h * i + (a + b) * i + (c * d * e * h) * i = t_1 + E'_1 * f_1 + (E'_2) * f_1; tree drawing not recoverable.]

In this case $(E'_2) * f_1$ is still dominant in $E^{d_3}$, but no further decomposition is possible, so $h[E^{d_3}]$ is distributionally minimal.

We now consider expressions of the form $E = (\sum t_i) / f$ and determine when distribution of $f$ over $\sum t_i$, using division, reduces tree-height. The methodology is essentially the same as distribution using multiplication.

Lemma 2.11: Let $E = \prod_{i=1}^{p} f_i$, $E_1 = (E) / f$ and $E_2 = f_1 * f_2 * \dots * f_p / f$, all factors be monolithic, and $w_d \ge w_m$. If

\[ \left\lceil \sum_{c,\,i=1}^{p} e_m[f_i] +_d e_m[f] \right\rceil < 2^{w_d/w_m} \left\lceil \sum_{c,\,i=1}^{p} e_m[f_i] \right\rceil \]

then there exists a divide-hole in $E$ to accommodate $f$, and $h[E_1] > h[E_2]$.

Lemma 2.12: If $E = \sum_{i=1}^{p_1} t_i + \left( \sum_{i=1}^{p_2} t'_i \right) / f$, then if $\left( \sum_{i=1}^{p_2} t'_i \right) / f$ is not dominant in $E$, then distribution of $f$ across $\sum_{i=1}^{p_2} t'_i$ does not reduce tree-height.

A Division Distribution Algorithm is essentially the same as the multiplication distribution algorithm, if Lemmas 2.11 and 2.12 are applied in place of Lemmas 2.7 and 2.9 respectively.

Suppose there is an expression such as $E = \left( \sum_{i=1}^{n} t_i \right) * \left( \sum_{j=1}^{m} t'_j \right)$. Under what conditions does distribution reduce tree-height, and what is the maximum possible reduction? The following theorem answers these questions.

Theorem 2.5: Let $E = \left( \sum_{i=1}^{n} t_i \right) * \left( \sum_{j=1}^{m} t'_j \right)$ and $E^d = t_1 t'_1 + t_1 t'_2 + \dots + t_1 t'_m + \dots + t_n t'_m$. Let $n \ge m$. If $m \ge 2$ then $h[E^d] \ge h[E]$, and if $m = 1$ then $h[E^d] \ge h[E] - w_a$.

Proof: Let $E = A * B$, where $A = \sum_{i=1}^{n} t_i$ and $B = \sum_{j=1}^{m} t'_j$. Without loss of generality, assume that $h[A] \ge h[B]$. By definition,

\[ e_a[E^d] = \left\lceil \sum_{c,\,i=1}^{n} \sum_{c,\,j=1}^{m} e_a[t_i * t'_j] \right\rceil . \]

Since $e_a[t_i * t'_j] \ge e_a[t_i]$, then $e_a[E^d] \ge m \left\lceil \sum_{c,\,i=1}^{n} e_a[t_i] \right\rceil = m\, e_a[A]$. Since, by assumption, $A$ is the larger factor of $E$, then $e_a[A] \ge \tfrac{1}{2} e_a[E]$. Hence $e_a[E^d] \ge \tfrac{m}{2} e_a[E]$, which implies that $h[E^d] \ge h[E] + w_a (\log_2 m - 1)$.

Q.E.D.
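The statement of Theorem 2.5 can be illustrated on the simplest family of inputs, where every term is a single variable. The sketch below is only a numeric spot check under stated assumptions: all $t_i$ and $t'_j$ have height 0, sums and products are parsed two-smallest-first, and the weights are $w_a = 2$ and $w_m = 3$ as in the examples of this chapter. The names `combine` and `heights_before_and_after` are hypothetical.

```python
import heapq
from math import log2

def combine(heights, weight):
    """Height of operands combined two smallest first, `weight` per operation."""
    heap = list(heights)
    heapq.heapify(heap)
    while len(heap) > 1:
        heapq.heappop(heap)
        heapq.heappush(heap, heapq.heappop(heap) + weight)
    return heap[0]

def heights_before_and_after(n, m, w_a, w_m):
    """(h[E], h[E^d]) for E = (t_1 + ... + t_n) * (t'_1 + ... + t'_m),
    where every term is a variable and E^d is the full distribution."""
    h_e = combine([combine([0] * n, w_a), combine([0] * m, w_a)], w_m)
    h_ed = combine([combine([0, 0], w_m)] * (n * m), w_a)   # n*m products t_i * t'_j
    return h_e, h_ed

for n, m in [(2, 2), (4, 4), (3, 1)]:
    h_e, h_ed = heights_before_and_after(n, m, w_a=2, w_m=3)
    bound = h_e + 2 * (log2(m) - 1)       # h[E] + w_a (log2 m - 1)
    print(n, m, h_e, h_ed, h_ed >= bound)
# 2 2 5 7 True
# 4 4 7 11 True
# 3 1 7 7 True
```

In each of these cases full distribution fails to lower the tree-height, in line with the theorem; the check says nothing about terms that are themselves non-trivial expressions.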


Theorem 2.5 verifies that we need only consider distribution to reduce tree-height of expressions such as $E = \left( \sum^{n} t_i \right) * \left( \sum^{m} t'_j \right)$ when either $\left( \sum^{n} t_i \right)$ or $\left( \sum^{m} t'_j \right)$ is monolithic, i.e., only when $E = \left( \sum^{n} t_i \right) * f$.

Let us reconsider expressions of the form $E = \prod^{n} f_i$. Suppose the factors $f_i$ are not monolithic, i.e., suppose

\[ E = \Big( \sum_{j=1}^{n_1} t_{1,j} \Big) * \Big( \sum_{j=1}^{n_2} t_{2,j} \Big) * \dots * \Big( \sum_{j=1}^{n_m} t_{m,j} \Big) . \]

Is it possible to reduce tree-height of $E$ by distribution? By applying the principles of Lemma 2.9 and Theorem 2.5, it is.

Lemma 2.13: Let $E = \prod_{i=1}^{n} f_i$, where the factors, $f_i$, are not monolithic. The tree-height of $E$ is reducible through distribution if and only if there is exactly one dominant factor in $E$.

Let $f_k$ be the only dominant factor in $E = \prod^{n} f_i$. Then we may write $E = \left( \sum_{j=1}^{n_k} t_{k,j} \right) * \prod_{i \ne k} f_i$. Distribution was previously investigated for expressions of the form $(\sum t_i) * f$. In this case, $f$ is replaced by a product of factors. The following lemma is a generalization of Lemma 2.7.

Lemma 2.14: Let $E = \prod_{i=1}^{p_1} f_i$, $E_1 = \left( \prod_{i=1}^{p_1} f_i \right) * \prod_{i=p_1+1}^{p_1+p_2} f_i$ and $E_2 = \prod_{i=1}^{p_1+p_2} f_i$, and all factors be monolithic. If

\[ \left\lceil \sum_{c,\,i=1}^{p_1+p_2} e_m[f_i] \right\rceil < 2 * \left\lceil \sum_{c,\,i=1}^{p_1} e_m[f_i] \right\rceil \]

then there exist holes in $E$ to accommodate $\prod_{i=p_1+1}^{p_1+p_2} f_i$ and $h[E_2] < h[E_1]$.

Hence, distribution is performed exactly as before with

\[ E^d = \sum_{j \in J} t_{k,j} * \prod_{i \ne k} f_i + \Big( \sum_{j \notin J} t_{k,j} \Big) * \prod_{i \ne k} f_i , \]

where $J$ is the set of major-dominant terms in $\sum_{j=1}^{n_k} t_{k,j}$.

The Distribution Algorithm for $E = \prod f_i$ (non-monolithic $f_i$) is the application of Lemmas 2.13 and 2.14 followed by the application of the multiplication distribution algorithm.

For example, suppose $E = (a + bc(d + e)(f + gh))\,i\,(j + k) = \prod_{i=1}^{3} f_i$. The only dominant factor in $E$ is $f_1$, and $bc(d + e)(f + gh)$ is the dominant term in $f_1$, as may be seen in the following illustration:

[Syntactic tree of E = (a + b * c * (d + e) * (f + g * h)) * i * (j + k) = f_1 * f_2 * f_3; tree drawing not recoverable.]

Since holes exist in the dominant term of $f_1$ to accommodate $i * (j + k)$, we may distribute and form $E^d = bc(d + e)(f + gh)i(j + k) + (a)i(j + k)$, and the syntactic tree is:

[Syntactic tree of E^d = b * c * i * (d + e) * (j + k) * (f + g * h) + a * i * (j + k) = t_1 + t_2; tree drawing not recoverable.]

Similarly, let us reconsider expressions of the form $E = \sum_{i=1}^{m} t_i$, where the terms, $t_i$, are not monolithic, i.e., suppose

\[ E = \prod_{j=1}^{n_1} f_{1,j} + \prod_{j=1}^{n_2} f_{2,j} + \dots + \prod_{j=1}^{n_m} f_{m,j} . \]

Let $J$ be the set of major-dominant subexpressions in $E$. Then $E$ may be written as

\[ E = \sum_{i \notin J} t_i + \sum_{i \in J} \prod_{j=1}^{n_i} f_{i,j} . \]

For each $i \in J$, $t_i$ is of the form $t_i = \prod_j f_{i,j}$. If the tree-heights of all such $t_i$ are reducible by distribution using the above algorithm, then so is the tree-height of $E$.

Finally, we consider expressions of the form $E = \left( \prod^{p_1} f_i \right) / \left( \prod^{p_2} f'_j \right)$. All of the tools have been developed to evaluate $E$ such that the tree-height of $E$ is reduced. The basis of the following algorithm is to balance the numerator expression with the denominator expression.

Extended Division Algorithm

Let $Q = N / D$ where $N = \prod_{i=1}^{n} f_i$ and $D = \prod_{i=1}^{m} f'_i$, and let $e_m[f_1] \le e_m[f_2] \le \dots \le e_m[f_n]$ and $e_m[f'_1] \le \dots \le e_m[f'_m]$ during all iterations of the algorithm. We assume that $w_d \ge w_m$. Furthermore, we modify the division algorithm* in the following way: whenever the division algorithm selects $f_j$ to form the quotient $f_j / f$ such that $E = \prod_{i=1}^{n} f_i / f$ becomes $E' = (f_j / f) * \prod_{i \ne j} f_i$, and $e_m[f] < e_m[f_j]$ and $(f_j / f)$ is the only dominant factor in $E'$, then distribute $f$ across $f_j = \sum_{k=1}^{n_j} t_{j,k}$ if tree-height is reduced (cf. Definition 2.6).

1) If $m = 1$, then use the modified division algorithm to evaluate $Q = \prod^{n} f_i / f'_1$.


2) If $\lceil \sum_{i=1}^{m} e_m[f'_i] \rceil = \lceil \sum_{i=2}^{m} e_m[f'_i] \rceil$, then combine $f'_1 * f'_2$, using distribution if $f'$ is dominant in $D$, to a single factor, thus producing $f''_1, f''_2, \dots, f''_{m'}$, in accordance with the $\lceil\ \rceil$ definition. Hence $m' < m$ and let $e_m[f''_1] < e_m[f''_2] < \dots < e_m[f''_{m'}]$. Set $m = m'$, and go to step 1.

3) If $e_m[N/f'_1] < e_m[D]$, then replace $N$ by $N/f'_1$, using the modified division algorithm, $D$ by $D/f'_1$, and set $m = m-1$, and relabel all $f'_i$ as $f'_{i-1}$. Go to step 1.

4) Otherwise, combine $D$ to a single factor $f'_1$, using distribution, where possible, for expressions of the form $\prod_{i=1}^{n} f_i$, set $m = 1$ and go to step 1.

* C.f. Definition 2.6.

Intuitively, the above algorithm iteratively selects factors $f'$ in the denominator which will lower the tree-height of the denominator when removed from $D$ and raise the tree-height of the numerator when included in $N$ as $N/f'$. This process terminates when the numerator and denominator are balanced, within certain bounds, whereupon the modified division algorithm is used to find $Q$. Note that passing the condition $e_m[N/f'_1] < e_m[D]$ guarantees that tree-height is reduced by at least one and at most $w_m$ units of time.

For example, consider the expression $E = a(b+cd)/(e(f+g)(h+ijk))$. The syntactic tree of $E$ with $w_a = 2$, $w_m = 3$, and $w_d = 5$ is:

$E = (a * (b+c*d))/(e * (f+g) * (h+i*j*k)) = f_1 * f_2/(f'_1 * f'_2 * f'_3)$

Since $e_m[D/f'_1] = e_m[D]$, $e * (f+g)$ is combined into one factor, and $E$ becomes $E = f_1 * f_2/(f'_1 * f'_2)$. During the next iteration $e_m[D/f'_1] < e_m[D]$, but $e_m[f_1 * f_2/f'_1] = 2^{13/3} \ge e_m[D] = 2^{11/3}$. Hence the tree is balanced as much as possible and $D$ is combined to a single factor, with no distribution, and thus no tree-height reduction is possible.

On the other hand, consider $E = a(b+cd)/(e(f+gh)(i+jk))$. Then the syntactic tree is:

$E = (a * (b+c*d))/(e * ((f+g*h) * (i+j*k))) = f_1 * f_2/(f'_1 * f'_2)$

[Syntactic tree figure not recoverable; the vertical axis marks time levels 0 through 16.]

In this case $e_m[D/f'_1] < e_m[D]$, and $e_m[f_1 * f_2/f'_1] < e_m[D]$, so $E$ becomes $(a(b+cd)/e)/((f+gh)(i+jk))$ and the syntactic tree is:
$E = ((a/e) * (b+c*d))/((f+g*h)(i+j*k)) = f_1 * f_2/f'_1$

[Syntactic tree figure not recoverable; the vertical axis marks time levels 0 through 13.]

Clearly $h[E] = 13$ is minimal in this case.
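The tree-heights quoted for this example can be checked with a small sketch. It assumes the usual recursive reading of syntactic tree-height (the height of an operator node is the larger of its operands' heights plus the operator weight, and an identifier has height 0) and uses the weights $w_a = 2$, $w_m = 3$, $w_d = 5$ from the example. The nested-tuple encoding and the names `WEIGHTS`, `height`, `original`, and `balanced` are illustrative choices, not taken from the PL/I program in the Appendix.

```python
# Operator weights from the worked example: add = 2, multiply = 3, divide = 5.
WEIGHTS = {"+": 2, "*": 3, "/": 5}


def height(node):
    """Tree-height of an expression given as a nested tuple (op, left, right).

    A bare string stands for a scalar identifier and has height 0.
    """
    if isinstance(node, str):
        return 0
    op, left, right = node
    return max(height(left), height(right)) + WEIGHTS[op]


f_gh = ("+", "f", ("*", "g", "h"))
i_jk = ("+", "i", ("*", "j", "k"))
b_cd = ("+", "b", ("*", "c", "d"))

# E = (a * (b+c*d)) / (e * ((f+g*h) * (i+j*k)))
original = ("/", ("*", "a", b_cd), ("*", "e", ("*", f_gh, i_jk)))

# E = ((a/e) * (b+c*d)) / ((f+g*h) * (i+j*k)), after moving e into the numerator
balanced = ("/", ("*", ("/", "a", "e"), b_cd), ("*", f_gh, i_jk))

print(height(original), height(balanced))
```

Under these assumptions the two trees come out at heights 16 and 13, matching the level ranges of the two figures and the stated value $h[E] = 13$.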


2.6. Conclusion

Any arithmetic expression except a continued fraction may be transformed to one of the forms presented in this chapter. The algorithms given either reduce or minimize the syntactic tree-height. Expressions where functions appear in place of variable names are easily handled by giving a weight, which may be fixed or a function of the number or type of parameters, to the function.

A  PL/I  program,  which  implements  many  of  the  algorithms  presented 
in  this  chapter,  appears  in  the  Appendix.   The  program  accepts  an 
expression  as  input  in  the  array  EXP  and  produces  MHT,  the  reduced 
tree-height  of  the  expression,  as  output. 

The format of an expression such as $E = a + (b-(c+d)e)f$ in EXP is:

    Column:    0     1     2     3     4     5     6    ...   159
    Row  0     +     a     +    -1     *     f     >
    Row  1     +     b     -    -2     *     e     *
    Row  2     +     c     +     d     *
     ...
    Row 19

The negative entries point to the row containing the appropriate factor. Row zero always contains the expression E, while other rows contain subexpressions.

Even  though  the  program  uses  recursive  procedures  extensively,  the 
computer  time  on  an  IBM  360/75  required  to  analyze  most  statements 
found  in  FORTRAN  programs  is  on  the  order  of  one-half  second. 
3. TREE-HEIGHT REDUCTION FOR A SEQUENCE OF MATRIX PRODUCTS

3.1. Introduction

In this chapter the time required to form the product of a sequence of conformable matrices is investigated. Unlike extended scalar products where the commutative and associative laws of arithmetic may be applied to reduce syntactic tree-height (c.f. Chapter 2), only the associative law may be applied to a sequence of matrix products. Sometimes, however, certain types of matrix products reduce to a scalar in which case the commutative law may be used as well (i.e., for a scalar $z$ and a matrix $A$, $zA = Az$). Muraoka and Kuck (29) have found a method to recognize when a sequence of matrix products contains subexpressions which reduce to a scalar.

Furthermore, where the multiply operation of scalars requires a fixed amount of time, $w_m$, the multiply operation of matrices is a function of the matrix dimensions. If a system of parallel processors is used, then the matrix product $A_1A_2$, where $A_1$ is dimension $a_0 \times a_1$ and $A_2$ is dimension $a_1 \times a_2$, may be performed in

$$t = w_m + w_a \log_2 \lceil a_1 \rceil_1$$

time, or units, where $w_m$ is the multiply weight and $w_a$ is the add weight. Under ideal conditions, the $a_0a_1a_2$ multiplies may be performed simultaneously in time $w_m$, and the $a_0a_2$ elements of the product matrix may be found simultaneously in time $w_a \log_2 \lceil a_1 \rceil_1$.
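As a rough illustration of this cost model, the sketch below computes the parallel time and the operation counts for a single product. It assumes that $\log_2 \lceil a_1 \rceil_1$ is the base-2 logarithm of $a_1$ rounded up to a power of two (so it equals the depth of a binary addition tree over $a_1$ terms), and that each of the $a_0a_2$ result elements needs $a_1 - 1$ additions, which generalizes the $n^3 - n^2$ count given for square matrices in Table 3.1. The function name `parallel_product_cost` is hypothetical.

```python
def parallel_product_cost(a0, a1, a2, w_m, w_a):
    """Cost of an (a0 x a1) by (a1 x a2) matrix product under ideal parallelism.

    Returns (time, multiplies, adds).
    """
    # (a1 - 1).bit_length() is the smallest p with a1 <= 2**p,
    # i.e. the depth of a binary addition tree over a1 terms.
    add_levels = (a1 - 1).bit_length()
    time = w_m + w_a * add_levels
    multiplies = a0 * a1 * a2          # all performed simultaneously
    adds = a0 * a2 * (a1 - 1)          # a1 - 1 additions per result element
    return time, multiplies, adds


print(parallel_product_cost(9, 6, 4, w_m=3, w_a=2))
```

With $a_0 = 9$, $a_1 = 6$, $a_2 = 4$, $w_m = 3$ and $w_a = 2$ this yields 216 multiplies and a total time of 9 units (3 for the multiplies, 6 for the additions).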
An algorithm is given for association of a sequence of matrix products such that the syntactic tree-height is minimized, and hence the execution time on a system of parallel processors is also minimized. Furthermore, if two choices of association give the same tree-height, but one results in fewer computer operations than the other, then it is preferred.

3.2.   Scalar  Matrix  Product  Subexpressions  and  Canonical  Form 

This section summarizes the pertinent work of Muraoka and Kuck (29). A matrix product expression $E = \prod_{i=1}^{m} A_i$ is considered where the matrix sequence is conformable and each $A_i$ is either an $n \times n$ matrix, a $1 \times n$ matrix (i.e., a row vector) which is replaced by $R_i$, an $n \times 1$ matrix (i.e., a column vector) which is replaced by $C_i$, or a $1 \times 1$ matrix (i.e., a scalar) which is replaced by $z_i$. The syntax for $E$ is defined by the set of productions shown in Table 3.1. Included in Table 3.1 are the transformation product weights as well as the number of multiply and add operations using a system of parallel processors.

    Production           Parallel Weight                          Number of     Number of
                                                                  Multiplies    Adds

    E ← A | R | C | S
    A ← CR |             $w_m + w_a \log_2 \lceil 1 \rceil_1$     $n^2$         0
        AA |             $w_m + w_a \log_2 \lceil n \rceil_1$     $n^3$         $n^3 - n^2$
        zA | Az          $w_m$                                    $n^2$         0
    R ← RA |             $w_m + w_a \log_2 \lceil n \rceil_1$     $n^2$         $n^2 - n$
        zR | Rz          $w_m$                                    $n$           0
    C ← AC |             $w_m + w_a \log_2 \lceil n \rceil_1$     $n^2$         $n^2 - n$
        zC | Cz          $w_m$                                    $n$           0
    S ← RC |             $w_m + w_a \log_2 \lceil n \rceil_1$     $n$           $n - 1$
        zS | Sz |        $w_m$                                    1             0

Table 3.1: Productions and Weights for a Matrix Expression

Matrix product expressions are regular, i.e., the grammar is regular. The only well-formed instances of these expressions are of the forms

    E = A*C(RA*C)*RA*,
    E = A*C(RA*C)*,
    E = (RA*C)*RA*,                                   (3.1)
    E = AA*, or
    E = (RA*C)*,

where * is the Kleene star, which means that zero or more instances of elements of the previous type occur. For example, A* = (λ ∪ A ∪ AA ∪ ...), where λ is the empty symbol.

An expression E is first scanned for instances of sub-expressions which are of type RA*C. It is obvious from the fact that S ← RA*C (use the productions R ← RA and S ← RC), and that the scalar RA*C may commute with any element in the expression, that one may write

    E' = [(RA*C)*]A*CRA*,
    E' = [(RA*C)*]A*C,
    E' = (RA*C)*RA*, or                               (3.2)
    E' = [(RA*C)*]A*.

E' is said to be the canonical form of E. The square brackets in equation (3.2) indicate that the contents between the brackets must be evaluated separately from the rest, otherwise E' is not well-formed. (This is obvious since neither the production CA nor CC occurs in Table 3.1.)

For example, suppose $E = A_1A_2C_3R_4A_5C_6R_7C_8R_9A_{10}$. Since both $R_4A_5C_6$ and $R_7C_8$ reduce to a scalar, then $E' = [(R_4A_5C_6)(R_7C_8)]A_1A_2C_3R_9A_{10}$. The algorithm to find an instance of RA*C in an expression E is obvious.
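One way to make the scan concrete is a regular-expression search, sketched below. It assumes the expression is already encoded as a string over the letters A, R and C (one letter per matrix, subscripts dropped) and that it is well-formed in the sense of (3.1), so that a single left-to-right pass finds every scalar sub-product. The function name `canonical_form` and the string encoding are illustrative, not the method of Muraoka and Kuck.

```python
import re

# A scalar sub-product: a row vector, zero or more matrices, a column vector.
SCALAR = re.compile(r"RA*C")


def canonical_form(expr):
    """Split a product string into its scalar sub-products and the remainder.

    Returns (scalars, remainder); the canonical form is the bracketed
    product of the scalars followed by the remainder.
    """
    scalars = SCALAR.findall(expr)
    remainder = SCALAR.sub("", expr)
    return scalars, remainder


scalars, remainder = canonical_form("AACRACRCRA")
print("[" + "".join(f"({s})" for s in scalars) + "]" + remainder)
```

For the ten-matrix example in the text this prints `[(RAC)(RC)]AACRA`, which corresponds to the canonical form $[(R_4A_5C_6)(R_7C_8)]A_1A_2C_3R_9A_{10}$.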


3.3.   Conditions  for  Tree-height  Minimization 

Muraoka and Kuck give an algorithm to find the rules of association of matrix products in canonical form where the matrices are either $n \times n$ (square) matrices or vectors of length $n$ (rows or columns) such that a balanced syntactic tree results. This section gives the rules of association for matrix product expressions such as $E = \prod_{i=1}^{m} A_i$, where each matrix $A_i$ is of dimension $a_{i-1} \times a_i$, such that syntactic tree-height is minimized.

We first obtain the canonical form of E and proceed by treating each instance of matrix sub-products of the form RA*C separately from the remaining product sequence. Since A*CRA*, A*C, and RA* are generically all of type A*, where A is not necessarily square, for the purpose of discussion, we consider the canonical form of all expressions to be of the form E' = [(RA*C)*]A*. We then show how to build balanced syntactic trees, by presenting several lemmas, such that tree-height is minimized. These lemmas are based on the technique used in Chapter 2 where it was shown that the discrete combination method minimized tree-height (for monolithic subexpressions) by minimizing slack (c.f. Theorem 2.1). Also, if two parsing options produce the same tree-height, then it is shown how to select the one such that fewer total computer operations (adds and multiplies) are required.
Definition 3.1: Let $\lceil n \rceil_1 = 2^p$, where $p$ is the smallest integer such that $n \le 2^p$. (c.f. Definition 2.4 of $\lceil\ \rceil$. In this case, there is only one coset-group and hence, the subscript 1.)

Definition 3.2: Let $w_T$ be the weight of transformation $T$, where $T$ is either some matrix $A_i$ from the matrix product expression, $E = \prod A_i$, or $T$ is a product, or composed, of two matrices $(T_iT_{i+1})$.

The value of $w_T$ is given by the following obvious lemma.

Lemma 3.1: If $T = T_iT_{i+1}$, where $T_i$ is a $t_{i-1} \times t_i$ matrix and $T_{i+1}$ is a $t_i \times t_{i+1}$ matrix, then $w_T = w_m + w_a \log_2 \lceil t_i \rceil_1$. If $T = A_i$, then $w_T = 0$.

Definition 3.3: Let $h[T]$ be the minimum syntactic tree-height of the transformation $T$.

The value of $h[T]$ is given by the following obvious lemma.

Lemma 3.2: If $T = T_iT_{i+1}$, then $h[T] = \max(h[T_i], h[T_{i+1}]) + w_T$. If $T = A_i$, then $h[T] = 0$.
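Lemmas 3.1 and 3.2 together give a recursive way to evaluate the tree-height of any fixed association, which the following sketch illustrates. The encoding is an assumption made for the example: a matrix $A_i$ is the integer `i`, a product is a pair `(left, right)` of adjacent sub-products, and `dims` holds the dimension list $(a_0, a_1, \dots, a_m)$. The names `log2_pow2_ceiling` and `product_height` are hypothetical.

```python
def log2_pow2_ceiling(t):
    """log2 of t rounded up to a power of two (Definition 3.1)."""
    return (t - 1).bit_length()


def product_height(tree, dims, w_m, w_a):
    """Return (height, first, last) for an association of matrices first..last.

    tree is an int i (the matrix A_i, of dimension dims[i-1] x dims[i])
    or a pair (left, right) of adjacent sub-trees.
    """
    if isinstance(tree, int):
        return 0, tree, tree                      # Lemma 3.2: h[A_i] = 0
    h_left, first, mid = product_height(tree[0], dims, w_m, w_a)
    h_right, nxt, last = product_height(tree[1], dims, w_m, w_a)
    if nxt != mid + 1:
        raise ValueError("sub-products must be adjacent")
    # Lemma 3.1: the common dimension of the two operands is dims[mid].
    w_t = w_m + w_a * log2_pow2_ceiling(dims[mid])
    return max(h_left, h_right) + w_t, first, last  # Lemma 3.2


dims = (9, 6, 4, 3, 1, 8, 15, 3, 6, 9)
tree = (((1, 2), ((3, 4), 5)), ((6, 7), (8, 9)))
print(product_height(tree, dims, w_m=3, w_a=2)[0])
```

The dimension list and the association used here are those of Figure 3.1; for the sub-product $(A_1A_2)$ alone the sketch gives a height of 9, in line with the level quoted in the discussion of that figure.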

Now that the notation has been presented, we return to the problem of evaluating the product of a matrix sequence, which is in canonical form, i.e., we wish to calculate the matrix $E = \prod_{i=1}^{m} A_i$. This must be performed by iteratively calculating transformations, i.e., reducing two matrices to one, until only one matrix remains. Furthermore, a transformation $T_i$ may only be paired with its neighbor to the left or right, i.e., $T_{i-1}$ or $T_{i+1}$, respectively. Lemma 3.2 gives the value of $h[E]$ when $E = T_1T_2$ or $E = T_1$. When $E = \prod_{i=1}^{k} T_i$ and $k \ge 3$, the association rules, or pairing strategy, to produce $h[E]$ are not immediately obvious.

However, Lemma 3.2 shows that the formation of matrix products is still a discrete combination process, comparable to the process described in Chapter 2. It will be shown that, similar to Chapter 2, a combination strategy which minimizes slack for matrix product expressions also minimizes syntactic tree-height.

Let us apply the definitions of Section 2.2 to the matrix product problem. Some simple algebraic manipulation may be used to prove the following lemma.


Lemma 3.3: If $E = T_1T_2$ and $h[T_1] \le h[T_2]$, then

$$e_m[E] = \lceil t_1 \rceil_1^{w_a/w_m} (e_m[T_1] + e_m[T_2] + s_{1,2}).$$

Recall that $s_{1,2}$ is the slack between the subtrees for $T_1$ and $T_2$. The significance of the expression $(e_m[T_1] + e_m[T_2] + s_{1,2}) = 2e_m[T_2]$ may be seen immediately from the following illustration:

[Illustration not recoverable; it shows $e_m[T_1]$, $s_{1,2}$, and $e_m[T_2]$.]

Note that in Chapter 2, for a scalar expression $E = f_1 * f_2$ of monolithic factors where $h[f_1] \le h[f_2]$, it was shown that $e_m[E] = (e_m[f_1] + e_m[f_2] + s_{1,2})$, which differs only by a coefficient from the corresponding expression for a matrix product. The following lemma is obvious.

Lemma 3.4: $\lceil t_1 \rceil_1^{w_a/w_m} \ge 1$.
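The "simple algebraic manipulation" behind Lemma 3.3 can be sketched as follows. The sketch assumes the Chapter 2 conventions that the effective length is $e_m[T] = 2^{h[T]/w_m}$ (consistent with the value $2^{13/3}$ quoted earlier for a tree of height 13 with $w_m = 3$) and that the slack is $s_{1,2} = e_m[T_2] - e_m[T_1]$; neither definition is restated in this chapter, so both are assumptions here.

```latex
\begin{aligned}
h[E] &= h[T_2] + w_m + w_a \log_2 \lceil t_1 \rceil_1
      && \text{(Lemmas 3.1 and 3.2, with } h[T_1] \le h[T_2]\text{)} \\
e_m[E] &= 2^{h[E]/w_m}
        = 2 \cdot \lceil t_1 \rceil_1^{\,w_a/w_m} \cdot 2^{h[T_2]/w_m}
        = \lceil t_1 \rceil_1^{\,w_a/w_m} \cdot 2\, e_m[T_2] \\
2\, e_m[T_2] &= e_m[T_1] + e_m[T_2] + \bigl(e_m[T_2] - e_m[T_1]\bigr)
             = e_m[T_1] + e_m[T_2] + s_{1,2}
\end{aligned}
```

Substituting the last line into the second gives the statement of Lemma 3.3, and Lemma 3.4 holds because $\lceil t_1 \rceil_1 \ge 1$ and $w_a/w_m > 0$.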


Let us examine this coefficient more carefully. It is a function of the common dimension of every pair of matrices in the expression $E = \prod_{i=1}^{k} T_i$; i.e., for every pair of matrices $T_iT_{i+1}$ $(i = 1,2,\dots,k-1)$, $t_i$ is the common dimension. No matter what transformation pairing strategy is used to evaluate $E$, every transformation weight $w_{T_iT_{i+1}} = w_m + w_a \log_2 \lceil t_i \rceil_1$ must appear somewhere in the syntactic tree. The coefficient $\lceil t_1 \rceil_1^{w_a/w_m}$ in Lemma 3.3 is really unimportant if we wish to show that $e_m[E]$ is minimized. This fact is forcefully demonstrated in the following analysis, where it is shown that $e_m[E]$ is minimized only when the correct pairing strategy is used.

Recall that $e_m[T_1] \le e_m[T_2]$ in $E = T_1T_2$. Since $T_2$ has the higher tree-height, and since $T_2$ is a product of two matrices, which we call $T_{21}$ and $T_{22}$, let us split $T_2$ into its components, i.e., $T_2 = T_{21}T_{22}$, and let us assume that $e_m[T_{21}] \le e_m[T_{22}]$. Then $E = T_1T_{21}T_{22}$ and we may write

$$e_m[E] = \lceil t_1 \rceil_1^{w_a/w_m} \{ e_m[T_1] + \lceil t_{21} \rceil_1^{w_a/w_m} (e_m[T_{21}] + e_m[T_{22}] + s_{21,22}) + s_{1,2} \}, \qquad (3.3)$$

if and only if $e_m[T_1] \ge e_m[T_{22}]$. (In other words, the proper pairing strategy is, in fact, $T_1(T_{21}T_{22})$ rather than $(T_1T_{21})T_{22}$.) In the proof by contradiction, we suppose $e_m[T_1] < e_m[T_{22}]$ and then show $e_m[E]$ given by equation (3.3) is not minimal. Let $c_1 = \lceil t_1 \rceil_1^{w_a/w_m}$ and $c_{21} = \lceil t_{21} \rceil_1^{w_a/w_m}$. Then we claim that for $E' = (T_1T_{21})T_{22}$,

$$e_m[E'] = c_{21} \{ e_m[T_{22}] + c_1 (e_m[T_1] + e_m[T_{21}] + s_{1,21}) + s_{22,121} \}, \qquad (3.4)$$

where $T_{121} = T_1T_{21}$, has a smaller value than $e_m[E]$ given by equation (3.3). Let us replace the slack variables by their definitions, expand and rearrange terms on equations (3.3) and (3.4) and write,

$$e_m[E] = c_1 e_m[T_2] + 2c_1c_{21} (e_m[T_{22}]), \qquad (3.5)$$

and

$$e_m[E'] = c_{21} e_m[T_{121}] + 2c_1c_{21} (e_m[T_1] \text{ or } e_m[T_{21}]), \qquad (3.6)$$

respectively. Now we may compare these equations term-by-term. Since $e_m[T_{21}] \le e_m[T_{22}]$ by assumption, and $e_m[T_1] < e_m[T_{22}]$ by supposition, and since $e_m[T_2] = 2c_{21}e_m[T_{22}]$ by definition, then $e_m[T_2] \ge c_{21}e_m[T_{121}]$. Finally since $c_1 \ge 1$, the claim is obviously true. One could similarly show that $e_m[E'] < e_m[E]$ implies that $e_m[T_1] < e_m[T_{22}]$. Thus the following theorem has been proven.

Theorem 3.1: Let $E = T_1T_2T_3$ and assume that $h[T_1]$, $h[T_2]$, and $h[T_3]$ are in fact the minimum transformation tree-heights and let $h[T_1] \ge h[T_3]$ and $h[T_2] \le h[T_3]$. Then $h[T_1T_2T_3]$ is achievable if and only if the pairing strategy which minimizes slack is used.


When $E = \prod_{i=1}^{k} T_i$ and $k > 3$, by recursive application of Theorem 3.1 in a top-down fashion, we find that slack is minimized only when the pair $T_iT_{i+1}$ is selected for a transformation association where the composite transformations have tree-heights smaller than any other pair. That is, if $h_j = h[T_j] + h[T_{j+1}]$, then $h_i = \min\{h_j \mid j = 1,2,\dots,k-1\}$. Before attacking the remaining analysis for cases where $k > 3$, let us first complete the case for $k = 3$. There are two cases to investigate. If $h[T_2]$ is larger than $h[T_1]$ and $h[T_3]$, then we may wish to investigate whether or not to split $T_2$ into two parts, $T_{21}$ and $T_{22}$, so that $T_1T_{21}$ and $T_{22}T_3$ may be formed. Secondly, if all tree-heights are the same, then even though $h[T_1(T_2T_3)] = h[(T_1T_2)T_3]$, one may be preferred over the other because fewer total computer operations (adds and multiplies) may result. The proof of the following lemma is similar to the proof of Theorem 3.1.

Lemma 3.5: Let $E = T_1T_2T_3$. If $w_{T_1T_2} + h[T_2] + w_{T_2T_3} > w_{T_2} + \max(h[T_1] + w_{T_1T_2}, h[T_3] + w_{T_2T_3})$, then split $T_2$ into its components $T_{21}$ and $T_{22}$ and $h[T_1T_2T_3] = h[(T_1T_{21})(T_{22}T_3)]$.

Lemma 3.6: Let $E = T_1T_2T_3$ and $h[T_1] = h[T_2] = h[T_3]$. If $t_2(w_at_0 + w_mt_1(t_0 - t_3)) < t_3(w_at_1 + w_mt_0(t_1 - t_2))$, then $E = (T_1T_2)T_3$ gives fewer computer operations than $E' = T_1(T_2T_3)$.

The proof of Lemma 3.6 is easily shown by the fact that $E$ requires $t_0t_2(t_1 + t_3)$ multiplies and $t_0(t_2 + t_3) + t_1 + t_2 - 2$ adds and that $E'$ requires $t_1t_3(t_0 + t_2)$ multiplies and $t_3(t_0 + t_1) + t_1 + t_2 - 2$ adds.
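The comparison in Lemma 3.6 is plain arithmetic on the four dimensions, as the sketch below shows. It simply evaluates the multiply counts and the inequality as they are stated in the text, for $T_1$ of dimension $t_0 \times t_1$, $T_2$ of dimension $t_1 \times t_2$ and $T_3$ of dimension $t_2 \times t_3$; the function names `multiply_counts` and `prefers_left_association` and the sample dimensions are hypothetical.

```python
def multiply_counts(t0, t1, t2, t3):
    """Multiplies needed by (T1 T2) T3 and by T1 (T2 T3), in that order."""
    left_first = t0 * t2 * (t1 + t3)    # t0*t1*t2 for T1 T2, then t0*t2*t3
    right_first = t1 * t3 * (t0 + t2)   # t1*t2*t3 for T2 T3, then t0*t1*t3
    return left_first, right_first


def prefers_left_association(t0, t1, t2, t3, w_m, w_a):
    """Evaluate the inequality of Lemma 3.6 exactly as stated.

    True means (T1 T2) T3 is the cheaper association.
    """
    lhs = t2 * (w_a * t0 + w_m * t1 * (t0 - t3))
    rhs = t3 * (w_a * t1 + w_m * t0 * (t1 - t2))
    return lhs < rhs


print(multiply_counts(2, 3, 4, 5))
print(prefers_left_association(2, 3, 4, 5, w_m=3, w_a=2))
```

For the sample dimensions $(t_0, t_1, t_2, t_3) = (2, 3, 4, 5)$ the left association needs 64 multiplies against 90 for the right one, and the inequality selects the left association.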
Finally, two more questions when $k > 3$, $E = \prod_{i=1}^{k} T_i$, are resolved. Firstly, if a string of transformations have the same tree-height, and each pair is a candidate for association, how should the correct pair be selected? Secondly, since the transformations $T_1$ and $T_k$ can only be associated with $T_2$ and $T_{k-1}$, respectively, should these pairings be given special consideration? The first question is answered by the following lemma.

Lemma 3.7: Let $h_j = h[T_j] + h[T_{j+1}]$, for $j = 1,2,\dots,k-1$, and $J = \{j \mid h_j = \min(h_1,h_2,\dots,h_{k-1})\}$. If $|J| = 2$, then apply Lemma 3.6 to select either the smaller or the larger $p \in J$ and pair $T_pT_{p+1}$. If $|J| > 2$ then select the smallest $p_1 \in J$ and the largest $p_2 \in J$ and make both associations $T_{p_1}T_{p_1+1}$ and $T_{p_2}T_{p_2+1}$.

Clearly, this strategy ensures that the largest number of pairings is achievable at that level, that the syntactic tree is balanced on both ends, and that slack will be minimized. Note that the application of Lemma 3.7 on $E = \prod_{i=1}^{9} A_i$, for example, gives us $E = (A_1A_2)A_3A_4A_5A_6A_7(A_8A_9) = T_1T_2T_3T_4T_5T_6T_7$. Now we answer the second question.


Lemma 3.8: Let $k \ne 3$ and $E = \prod_{i=1}^{k} T_i$. If $h[T_1] \le h[T_2]$, then associate $T_1T_2$. Similarly, associate $T_{k-1}T_k$ if $h[T_{k-1}] \ge h[T_k]$.

Clearly, since $T_1$ can only be paired with $T_2$, the height of $T_1$ is fixed; whereas the height of $T_2$ may become larger if $T_2$ is paired with $T_3$. In other words, unless $T_1T_2$ are paired, $s_{1,2}$ can only grow in size. By Theorem 3.1, associating $T_1T_2$ leads to minimum tree-height of $E$.

Thus, all the conditions necessary to parse $E = \prod_{i=1}^{m} A_i$ such that syntactic tree-height is minimized have been established.


3.4.   The  Sequence  of  Matrix  Products  Parsing  Algorithm 

In this section, an algorithm to parse a sequence of matrix products in a bottom-up fashion, such as given by the expression $E = \prod_{i=1}^{m} A_i$, such that syntactic tree-height is minimized, is presented. The algorithm summarizes the results derived in Section 3.3.

1) Let $E = \prod_{i=1}^{k} T_i$ represent the expression $E = \prod_{i=1}^{m} A_i$ (i.e., $T_i = A_i$ and $k = m$), and let $w_{T_i}$ be the weight of transformation $T_i$.

2) If $k = 1$, then STOP.

3) If $k = 3$, then

3.1) If $h[T_1] = h[T_2] = h[T_3]$, then associate either $(T_1T_2)$ or $(T_2T_3)$ depending upon which leads to the smaller amount of processor time (c.f. Lemma 3.6), set $k = k-1$ and let $E = \prod_{i=1}^{k} T_i$, with appropriate relabeling of the $T_i$. Go to step 4.

3.2) If $w_{T_1T_2} + h[T_2] + w_{T_2T_3} > w_{T_2} + \max(h[T_1] + w_{T_1T_2}, h[T_3] + w_{T_2T_3})$, then split $T_2$ into $T_{21}$ and $T_{22}$, associate $(T_1T_{21})$ and $(T_{22}T_3)$ such that $E = (T_1T_{21})(T_{22}T_3)$; set $k = k-1$, and let $E = \prod_{i=1}^{k} T_i$ with appropriate relabeling of the $T_i$.

4) If $h[T_1] \le h[T_2]$, then associate $M = (T_1T_2)$, replace the two transformations $T_1$ and $T_2$ in the matrix sequence with $M$, set $k = k-1$ and relabel the $T_i$ such that $E = \prod_{i=1}^{k} T_i$. Go to step 2.

5) If $h[T_{k-1}] \ge h[T_k]$, then associate $M = (T_{k-1}T_k)$, replace the two transformations $T_{k-1}$ and $T_k$ in the matrix sequence with $M$, and set $k = k-1$. Go to step 2.

6) Calculate $h_i = h[T_i] + h[T_{i+1}]$ for $i = 1,2,\dots,k-1$, the set $J = \{i \mid h_i = \min(h_1,h_2,\dots,h_{k-1})\}$, and determine the smallest $p_1 \in J$ and the largest $p_2 \in J$.

6.1) If $|J| = 1$, then associate $M = T_pT_{p+1}$, eliminate $T_{p+1}$, replace $T_p$ with $M$ in the matrix product sequence, set $k = k-1$ and let $E = \prod_{i=1}^{k} T_i$. Go to step 3.

6.2) If $|J| = 2$, and $p_2 = p_1+1$, then apply Lemma 3.6 to select the appropriate transformation (i.e., either $T_{p_1}T_{p_1+1}$ or $T_{p_2}T_{p_2+1}$) for association as in step 6.1.

6.3) Otherwise, associate $M_1 = (T_{p_1}T_{p_1+1})$ and $M_2 = (T_{p_2}T_{p_2+1})$, replace the two transformations $T_{p_1}$ and $T_{p_1+1}$ in the matrix sequence with $M_1$ and similarly $T_{p_2}$ and $T_{p_2+1}$ with $M_2$, set $k = k-2$, and let $E = \prod_{i=1}^{k} T_i$ with appropriate relabeling. Go to step 4.
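For short sequences, the result of the parsing algorithm can be cross-checked against an exhaustive search. The sketch below is not the algorithm above: it is a standard interval dynamic program that tries every split point and keeps the association of least tree-height, using only the weight and height rules of Lemmas 3.1 and 3.2. It does not split existing transformations (step 3.2) and does not break ties by operation count. The name `min_tree_height` and the tuple encoding of associations are illustrative.

```python
from functools import lru_cache


def min_tree_height(dims, w_m, w_a):
    """Exhaustively find a minimum tree-height association.

    dims is (a_0, ..., a_m); matrix A_i has dimension dims[i-1] x dims[i].
    Returns (height, tree), where tree is an int i or a pair (left, right).
    """
    m = len(dims) - 1

    def weight(k):
        # Weight of a product whose common dimension is dims[k] (Lemma 3.1).
        return w_m + w_a * (dims[k] - 1).bit_length()

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == j:
            return 0, i
        candidates = []
        for k in range(i, j):          # split into A_i..A_k and A_{k+1}..A_j
            h_left, t_left = best(i, k)
            h_right, t_right = best(k + 1, j)
            candidates.append((max(h_left, h_right) + weight(k), (t_left, t_right)))
        return min(candidates, key=lambda c: c[0])

    return best(1, m)


height, tree = min_tree_height((9, 6, 4, 3, 1, 8, 15, 3, 6, 9), w_m=3, w_a=2)
print(height, tree)
```

The search costs on the order of $m^3$ steps, so it is only meant as a check on small inputs such as the nine-matrix example of Figure 3.1.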


Let the vector $a = (a_0, a_1, \dots, a_m)$ represent the ordered list of the common dimensions of $A_i$ (i.e., $a_{i-1} \times a_i$). Figure 3.1 demonstrates an application of the parsing algorithm on a sequence of matrix products which is in the form A*CRA*. ($A_4$ is a column vector and $A_5$ is a row vector.) The symbol =*= means that many multiplies occur simultaneously, at the indicated level and the symbol yf means that many additions occur simultaneously at the indicated level and each is a culmination of additions performed by a binary tree. For example, see $(A_1A_2)$ in Figure 3.1; at level 3, since $a_0 = 9$, $a_1 = 6$, and $a_2 = 4$, 216 multiplies are completed in $w_m = 3$ units of time, and at level 9, the 36 elements of the transformation $(A_1A_2)$ are completed in $w_a \log_2 \lceil a_1 \rceil_1 = 6$ additional units of time, since $w_a = 2$ and $a_1 = 6$.

Clearly, the algorithm is also applicable for extended matrix product expressions which are of the form RA*C. After each component of the canonical form E' = [(RA*C)*]A* is determined by the parsing algorithm, the scalar products at the end are found in accordance with the rules established in Chapter 2.

$E = ((A_1A_2)((A_3A_4)A_5))((A_6A_7)(A_8A_9))$

Figure 3.1: $E = \prod_{i=1}^{9} A_i$, $a = (9,6,4,3,1,8,15,3,6,9)$, $w_m = 3$, $w_a = 2$.
4. SCHEDULING FOR A WEIGHTED-NODE DIRECTED GRAPH

4.1. Introduction

The parsing algorithms developed in Chapter 2 produce weighted-node trees. If common subexpressions are combined, then a weighted-node graph results. Hence it is imperative that a scheduling algorithm be developed and a least upper bound (LUB) on the number of machines needed to complete the job be determined. In this chapter, algorithms are presented which provide solutions to these problems.

Some of the work described here is based on a paper by Hu (12), where an optimal scheduling algorithm for unit-weighted-node trees is given. The rest is based on the critical path method used by PERT, as described by Kauffman (18). The work in this chapter was developed independently of, and simultaneously with, that done by Ramamoorthy, et al. (31). The significant difference is that a greater lower bound on the number of machines required is determined here and an optimal scheduling algorithm for any number, $k$, of machines is presented, which requires $O(n^2)$ computer operations* to complete.

Schwartz (34) also developed a scheduling algorithm for a weighted-node graph. It is similar, in concept, to Hu's algorithm in that the nodes on the longest queue are processed first. Schwartz's algorithm, however, is only sub-optimal and is presented in this chapter as algorithm 3. While it is also an $O(n^2)$ computer operations algorithm, it is about twice as fast as the optimal algorithm.

* We define "computer operations" to be the operations add, subtract, compare, etc. which are in the instruction repertoire of most computers.

We are concerned with scheduling atoms of work, or tasks, in a non-preemptive manner on a set of machines where any one machine is capable of performing any task. Later the problem of scheduling special purpose machines, of several types, is discussed and a scheduling solution indicated. It is also assumed that the amount of work (or time) for each node is known a priori and that information transfer among the machines requires zero time. Finally, it is assumed that the directed graph describing the job is reduced; i.e., that all cycles, or circuits, in the original job description have been reduced to a single node.

Nodes in the reduced graph which represent circuits in the original graph are analyzed separately. For example, suppose that some FORTRAN program, which contains DO-loops or IF-loops, is the job to be analyzed. These loops are represented as circuits in the graph. Analysis is performed by expanding the loops until they are loop-free, in which case they may be represented by a reduced sub-graph. In other words, a circuit-free graph may be produced either by reduction or expansion.

4.2.   Lower  Bound  on  the  Number  of  Machines  Required 

We wish to determine the minimum number of machines that are required to process some job in minimum time. The job consists of a certain set of tasks, whose order dependency may be described by a directed graph, $G$. Each task is represented by a node of $G$, the dependency is represented by directed arcs between nodes, and the time to complete a task is the node-weight. It is assumed that these weights are integers. All circuits in $G$ are reduced to a single node or expanded to obtain an acyclic graph. In all cases during the following discussion, let there be $n$ nodes in $G$.

Definition 4.1: A starting node is a node with no predecessors and a
terminal node is a node with no successors.

$G$ may contain several nodes of each type, and $G$ may be separable
into two or more independent subgraphs. Also, if $n_j$ is a
predecessor of $n_i$ then write $n_j > n_i$, which is equivalent to saying
that $n_i$ is a successor of $n_j$, or that $n_i < n_j$.

Definition 4.2: Let $G_R$ be the relaxed graph of $G$. $G_R$ is defined
in the following way. All terminal nodes in $G$ are placed in the
lowest level, $q$, of $G_R$. All nodes in $G$ which have successors only
in the set of terminal nodes are placed in level $q-1$. Repeat this
process in decreasing level order $j$, so that all nodes in $G$ which
have successors only in the set of nodes in levels greater than $j$
are placed in level $j$ of $G_R$. When the unassigned nodes in $G$ are
all starting nodes, place them in the next level, thus establishing
the value of $q$ so that the levels of $G_R$ are labeled $1,2,\ldots,q$.
Include all connecting arcs in $G_R$ in accordance with those arcs in
$G$. Each node in $G_R$ is said to be tightly connected to some terminal
node. $G_R$ is, of course, isomorphic to $G$.

For each node $n_i$, $i = 1,2,\ldots,n$, let
$W_i = w_i + \max\{W_j \mid n_j < n_i\}$, where $w_i$ is the weight of
$n_i$ (for a terminal node the maximum is taken over the empty set, so
that $W_i = w_i$). $W_i$ is the largest node-weight-sum of all paths
between $n_i$ and the set of terminal nodes. Furthermore, let
$D_q = \max\{W_i \mid n_i \in G\}$ be the critical path value, or critical
time of $G_R$, i.e., the least amount of time in which the job described
by $G$ may be completed; $D_q$ is achievable when an arbitrarily large
number of machines is available.
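
The level assignment of Definition 4.2 and the recurrence for $W_i$ can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the representation of the graph as a dictionary of successor lists, and the five-node example graph are assumptions made here and are not taken from the text.

```python
def relaxed_levels(succ):
    """Level of each node in the relaxed graph G_R (Definition 4.2).

    succ maps every node to the list of its successors (acyclic graph).
    Terminal nodes land in level q; every other node lands one level
    before the lowest-numbered level that holds one of its successors.
    """
    depth = {}  # arcs on the longest path from a node to a terminal node

    def visit(n):
        if n not in depth:
            depth[n] = 1 + max((visit(s) for s in succ[n]), default=-1)
        return depth[n]

    for n in succ:
        visit(n)
    q = max(depth.values()) + 1
    return {n: q - d for n, d in depth.items()}, q


def path_weights(succ, w):
    """W[i] = w[i] + max(W[j] over successors j); the max is 0 for terminal nodes."""
    W = {}

    def visit(n):
        if n not in W:
            W[n] = w[n] + max((visit(s) for s in succ[n]), default=0)
        return W[n]

    for n in succ:
        visit(n)
    return W


# Hypothetical example graph: a -> c, b -> c, c -> e, d -> e.
succ = {"a": ["c"], "b": ["c"], "c": ["e"], "d": ["e"], "e": []}
w = {"a": 2, "b": 3, "c": 4, "d": 1, "e": 2}

level, q = relaxed_levels(succ)
W = path_weights(succ, w)
critical_time = max(W.values())  # D_q
print(level, q, W, critical_time)
```

In this example the critical path is b, c, e, so the critical time is 3 + 4 + 2 = 9. The recursive formulation is chosen for brevity; a very deep graph would call for an explicit level-by-level loop instead.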

Suppose we can cut $G_R$ in two pieces at some level $j$ such that
$H_j$ consists of levels 1 through $j$ of $G_R$, and $H'_j$ is the graph of
the remaining levels, $j+1$ through $q$. Then $D_j$ is the critical
time of $H_j$. Let $J = \{k \mid n_k \in H_j\}$ and $P_j = \sum_{i \in J} w_i$,
i.e., the sum of node weights in $H_j$. The following theorem produces a
lower bound on the number of machines required to process the job in
time $D_q$.

Theorem 4.1: If $m-1 < \max\{P_i/D_i \mid i \le q\}$ then at least $m$
machines are required to process all nodes in $D_q$ time.

Proof: Let $j$ be a value of $i$ such that
$P_j/D_j = \max\{P_i/D_i \mid i \le q\}$. Since $G_R$ is relaxed, so is
$H_j$. At time $D_j$ the total mass removed, $(m-1)D_j$, is less than
$P_j$ by assumption. Since $H_j$ is relaxed, there is at least one node,
$n_k \in H_j$, which is not completely processed; let $w'_k \ge 1$ be the
unprocessed portion of $n_k$. Let $D'_{q-j}$ be the critical time of
$H'_j$. Hence the total time to process $G_R$ is

    $T \ge D_j + w'_k + D'_{q-j}$.

Since $D_j + D'_{q-j} \ge D_q$, then

    $T \ge D_q + w'_k > D_q$.

Therefore it is impossible to process all nodes in $D_q$ time with
$m-1$ machines.
Q.E.D.
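
Read as a computation, Theorem 4.1 says that the bound is the largest value of $P_j/D_j$, rounded up to an integer. The following Python sketch computes it under stated assumptions: the function name, the dictionary representation, and the example data are hypothetical, the level of every node in the relaxed graph is assumed to be already known, and the node weights are assumed to be positive integers.

```python
def machine_lower_bound(succ, w, level):
    """Smallest m allowed by Theorem 4.1: the max over j of ceil(P_j / D_j).

    succ  : node -> list of successors (acyclic graph)
    w     : node -> positive integer weight
    level : node -> level in the relaxed graph (1 = first level, q = last)
    """
    pred = {n: [] for n in succ}
    for n, successors in succ.items():
        for s in successors:
            pred[s].append(n)

    # Predecessors always sit in lower-numbered levels, so sorting by level
    # yields an order in which every predecessor is handled first.
    order = sorted(succ, key=lambda n: level[n])
    done_by = {}  # heaviest path from a starting node up to and including n
    P = D = 0     # running values of P_j and D_j
    m = 1
    for pos, n in enumerate(order):
        done_by[n] = w[n] + max((done_by[p] for p in pred[n]), default=0)
        P += w[n]
        D = max(D, done_by[n])
        last_of_level = pos + 1 == len(order) or level[order[pos + 1]] != level[n]
        if last_of_level:
            m = max(m, -(-P // D))  # integer ceiling of P_j / D_j
    return m


# Hypothetical example graph: a -> c, b -> c, c -> e, d -> e.
succ = {"a": ["c"], "b": ["c"], "c": ["e"], "d": ["e"], "e": []}
w = {"a": 2, "b": 3, "c": 4, "d": 1, "e": 2}
level = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}
print(machine_lower_bound(succ, w, level))
```

For this graph the three cuts give $P_j/D_j$ = 5/3, 10/7 and 12/9, so the bound is two machines.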

Theorem 4.1 gives a lower bound on the number of machines required
which is not smaller than the lower bound found by either of the methods
of Hu or Ramamoorthy. However, we might ask, "Does the number of
computations required to calculate $m$ preclude its use?" For this
reason, and because the $W_i$ are required for later parts of this
chapter, algorithms are presented which find $W_i$, $D_j$, and $P_j$.

One could use the Bellman-Kalaba (18) algorithm to find the set
$W = \{W_1,\ldots,W_n\}$ but this requires $O(kn^2)$ operations where $k$ is
the number of iterations required for convergence. If, on the other hand,
one starts with a relaxed graph which is represented in list form
(i.e., associated with each node is a list of its successor nodes),
then the equation $W_i = w_i + \max\{W_j \mid n_j < n_i\}$ is used by
starting with the terminal nodes in level $q$, where $W_i = w_i$, then the
nodes in level $q-1$, and so forth until all nodes are thus processed.
In this case the set $W$ is computed in $O(kn)$ operations, where no node
has more than $k$ successors. Of course, $O(n^2)$ operations are
required to produce the relaxed graph list, but this is a penalty that
need only be paid once during the computation of $m$ from Theorem 4.1.

Let $S_i$ be the set of nodes in level $i$ of $G_R$,
$K = \{k \mid n_k \in S_i\}$, and $p_i = \sum_{j \in K} w_j$. Then
$P_j = p_j + P_{j-1}$, where $P_0 = 0$. This equation suggests that the
$D_j$ be calculated in the sequential order $D_1, D_2, \ldots, D_q$. Let
$N_j = |\bigcup_{i \le j} S_i|$. Then, since $N_j \le n-(q-j)$, the set of
$D_j$'s may all be computed in no more than $q\,O(n)$ operations, while the
set of $P_j$'s may clearly be computed in $O(n)$ operations.

If, on the other hand, the graph $G_R$ is represented as an incidence
matrix, $B$, of ones and zeroes where $b_{ij} = 1$ implies that
$n_i > n_j$, then the set of $D_j$'s require at most $q\,O(n^2/2)$
word-operations if $B$ is upper triangular. Representation of $B$ as a bit
array on computers with a powerful set of binary instructions could
further reduce the order of operations required.


4.3.   Scheduling  Algorithm  for  k  Machines 

Scheduling of machine utilization, or assignment of nodes, which
represent work to be done, to machines is essentially a mapping problem.
That is to say, one could assign a different machine, taken from an
arbitrarily large supply, to each node of the graph and thus complete
the job in the critical time, $C_0$. However, it is unrealistic to
consider an arbitrarily large machine resource, especially since each
machine assigned is idle most of the time. So, one could begin sharing
the machines among certain nodes; the machine space is thus reduced,
but now one must consider the dimension of time as well. One obvious
mapping strategy would be to assign one machine to all of the nodes
on a critical path, thereby still ensuring job completion in the
critical time, $C_0$.

Let α be a mapping procedure, $\alpha: M^{\infty} \times T \to M^{k} \times T$
(i.e., a mapping from the time-space domain of arbitrarily many machines
onto the time-space range restricted to $k$ machines) of the tasks on an
oriented, reduced graph $G$ such that the job described by $G$ is processed
in the least possible time $\Omega_{\alpha}(k)$. In other words, α is an
optimal scheduling algorithm for $k$ machines. Let β be some other
scheduling algorithm for $k$ machines such that the job described by $G$ is
completed in time $\Omega_{\beta}(k)$.

Lemma 4.1: $\Omega_{\beta}(k) \ge \Omega_{\alpha}(k) \ge C_0$.

Definition 4.3: Let a time-slot in $M^{k} \times T$ of a node $n_i$ be the
period of time $w_i$ that a particular machine consumes while processing
node $n_i$. Let γ be any mapping procedure,
$\gamma: M^{\infty} \times T \to M^{k} \times T$, and define ${}_{\gamma}W_i$
to be the time, as measured in $M^{k} \times T$, between the beginning
of the time-slot of $n_i$ and the end of the maximum time-slot of the
set of terminal nodes. The prescript γ indicates that it is a
function of the mapping procedure γ.

Since the largest node-sum-weight, $W_i$, represents a lower bound
on the time between $n_i$ and job completion time, the following lemma
follows directly.

Lemma 4.2: ${}_{\gamma}W_i \ge W_i$ for $i = 1,2,\ldots,n$.

The job completion times are defined as follows:

    $\Omega_{\gamma}(k) = \max({}_{\gamma}W_1, {}_{\gamma}W_2, \ldots, {}_{\gamma}W_n)$, and    (4.1)

    $\Omega_{\alpha}(k) = \max({}_{\alpha}W_1, {}_{\alpha}W_2, \ldots, {}_{\alpha}W_n)$.    (4.2)

Furthermore,

    ${}_{\gamma}W_j \ge w_j + \max\{{}_{\gamma}W_i \mid n_i \in G,\ n_j > n_i\}$.    (4.3)


The relationship (4.3) suggests that a mapping procedure, γ, be
accomplished in stages. As each node is assigned to some time-slot in
$M^{k} \times T$, the bounds on the time-slots which are available for its
predecessor or successor nodes are determined. Let $G_i$ be the graph of
one of these stages. Structurally, $G_i$ is isomorphic to $G$ and
consists of a subgraph $G_{A,i}$ which is the set of nodes already assigned
to $M^{k} \times T$, and the remaining graph $G'_i$. See Figure 4.1. For each
$G_i$, one could determine a critical path through $G_i$ and a critical
time, $C_i$.

Since the critical path of a graph is the set of nodes which most
urgently require processing when their time has come, and since $G_{A,i}$
represents the space where work may be performed by the $k$ machines,
the following definition is presented.

Figure 4.1: $G_i$, where $G_{A,i} = M^{k} \times T$ and $G'_i = G_i - G_{A,i}$

Definition 4.4: The critical path of $G_i$ passes from $G'_i$ into the
top and out the bottom of $G_{A,i}$.

This may be achieved in any way possible; for example, if $n_j$
is the last node on the critical path not in $G_{A,i}$, then it is
necessary to find a time-slot in $G_{A,i}$ which is not smaller than $w_j$,
but such that for all $n_k < n_j$, ${}_{\gamma}W_k$ is not greater than the
time at the end of the time-slot. Hence, the critical time
$C_i \ge C_{i-1}$. Let $d_i = C_i - C_{i-1}$.

Lemma 4.3: Let $G_{A,f}$, determined at the final iteration, $f$, of
procedure γ, be isomorphic to $G$. Then,

    $\Omega_{\gamma}(k) = C_f = C_0 + \sum_{i=1}^{f} d_i$.

Lemma 4.4: $\Omega_{\gamma}(k) = \Omega_{\alpha}(k)$ if and only if
$\sum_{i=1}^{f} d_i$ is minimized.


Lemma  4.4  defines  the  conditions  which  would  make  a  scheduling 
algorithm  optimal;  lemmas  4.2  and  4.3  and  equation  4.3  indicate  that 
an  algorithm  which  starts  with  the  terminal  nodes  would  facilitate 
computation. 

The algorithm α, which is given below, performs a look-ahead
at each iteration of one level in $G_i$ to see if the critical path
splits and thus reserves $v$ machines for this purpose. If
all node weights $w_i$ are of the same order of magnitude (i.e.,
$\max\{w_i \mid i \le n\} < 2 \cdot \min\{w_i \mid i \le n\}$) one level
look-ahead is sufficient, but not conversely. If more than one level
look-ahead is required, then appropriate changes to α would be required
at steps 5, 6 and 7.

Algorithm α: Assume that an $n \times n$ incidence matrix $B$, where
$b_{ij} = 1$ means that node $n_j < n_i$, of a reduced directed graph $G$
having $n$ nodes; a vector of weights $w = (w_1, w_2, \ldots, w_n)$; and $k$
are given. Then create $k$ lists (L's) and associate with each $L_i$
a scalar, $t_i$ $(i = 1,2,\ldots,k)$. Also create a vector
${}_{\alpha}W = ({}_{\alpha}W_1, {}_{\alpha}W_2, \ldots, {}_{\alpha}W_n)$,
a vector $\Delta = (\delta_1, \delta_2, \ldots, \delta_n)$, and a vector
$Y = (Y_1, Y_2, \ldots, Y_n)$.

1)  Set $t_i = 0$ $(i = 1,2,\ldots,k)$, $C$ = the critical time of $G$,
    $\delta_i = 0$ for $i = 1,2,\ldots,n$, and let $G' = G$. Calculate the
    vector $Y$, where $Y_i$ is the largest value of the sum of nodes on all
    paths between node $n_i$ and the set of starting nodes. Starting with
    level 1 of $G_R$ and proceeding in order to level $q$, if $n_i$
    is a starting node, then $Y_i = 0$; otherwise,
    $Y_i = \max\{w_j + Y_j \mid n_j > n_i,\ n_j \in G\}$.

2)  Find $\min(t_1, t_2, \ldots, t_k)$ and identify the $p \le k$ L's which
    satisfy $t_{p_1} = t_{p_2} = \cdots = t_{p_p} = \min(t_1, t_2, \ldots, t_k)$.
    Let $n_j$ be the last node on the list $L_{p_i}$; for each non-empty
    $L_{p_i}$ set column $j$ in $B$ to zero.

3)  Identify the set of terminal nodes in $G'$ as
    $Q = \{q_1, q_2, \ldots, q_q\}$; i.e., $n_i \in Q$ if $b_{ij} = 0$ for
    $j = 1,2,\ldots,n$. If $Q = \emptyset$, then for $i = 1,2,\ldots,p$, set
    $t_{p_i} = \min(t_{p_{p+1}}, t_{p_{p+2}}, \ldots, t_{p_k})$ and go
    to step 2.

4)  If $n_j$ is the node at the top of $L_{p_i}$ for $i = p+1, p+2, \ldots, k$
    (the L's not satisfying the min condition in step 2) set the $j$th
    column in $B$ to zero, but save the column $j$ for later restoration.

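5)  Identify the set of terminal nodes, which have been augmented by
    step 4, as $R = \{r_1, r_2, \ldots, r_r\}$. Calculate the
    ${}_{\alpha}W_{r_i}$ $(i = 1,2,\ldots,r)$, such that

    ${}_{\alpha}W_{r_i} = w_{r_i} + t_{p_1}$  if $n_{r_i} \in Q$;

    ${}_{\alpha}W_{r_i} = w_{r_i} + \max\{{}_{\alpha}W_j \mid n_{r_i} > n_j,\ n_j \in G\}$  if $n_{r_i} \notin Q$.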


6)  Let $M = \max(\{{}_{\alpha}W_{r_i} + Y_{r_i} \mid r_i \in R\})$,
    $S = \{n_{r_i} \mid r_i \in R$ and ${}_{\alpha}W_{r_i} + Y_{r_i} = M\}$,
    $S' = S \cap Q$, $M = \max(M, C)$, and $d = M - C$. If $d > 0$, then
    if any $n_j \in S'$ was previously denied assignment to $G_A$ in
    step 8 such that $0 < \delta_j < d$, then select a node
    $n_m \in \{n_j \mid n_j \in S'$ and $(\delta - \delta_j)$ is maximal$\}$,
    restore the state of $G'$ and $G_A$ at the time of the $n_m$ assignment
    denial, and go to step 8 at the point where $n_m$ is pushed into
    $L_{p_p}$, etc., marked †.

7)  Set $C = M$. If $(S - S') = \emptyset$, then set $v = 0$; otherwise let
    $u = \min(\{W_j - w_j \mid n_j \in (S - S')\})$,
    $U = \{n_j \mid W_j - w_j = u,\ n_j \in G'\}$,
    $X = \{i \mid t_i \le u,\ i = 1,2,\ldots,k\}$, and set
    $v = \max(0,\ p + |U| - |X|)$. If $p \le v$ or $S' = \emptyset$ then go to
    step 8. Otherwise, while $p > v$ and $|S'| > 0$, for
    $w_j = \max(\{w_i \mid n_i \in S'\})$ push $n_j$ into $L_{p_p}$, set
    $S' = S' - \{n_j\}$, $W_j = t_{p_p} = t_{p_p} + w_j$, $p = p - 1$, and
    eliminate the $j$th row from $B$; $G'$ represents the new graph with $n_j$
    removed and $G_A$ represents the new graph with $n_j$ assigned;
    the dependency of $G_A$ from $G'$, i.e., $G' \to G_A$, is in accordance
    with the inways of $n_j$ in $G$. Go to step 9.

8)  Let $M = \max(\{{}_{\alpha}W_{q_i} + Y_{q_i} \mid q_i \in Q\})$,
    $S' = \{n_{q_i} \mid q_i \in Q$ and ${}_{\alpha}W_{q_i} + Y_{q_i} = M\}$,
    and $P = p$. While $p > 0$ and $|S'| > 0$, for
    $w_j = \max(\{w_i \mid n_i \in S'\})$ set $S' = S' - \{n_j\}$ and
    $\delta = W_j - u$; if $\delta < 0$ then († ) push $n_j$ into $L_{p_p}$,
    set $t_{p_p} = W_j$, $p = p - 1$ and eliminate the $j$th row from $B$;
    otherwise, if $\delta \le \delta_j$ or $\delta_j = 0$ then set
    $\delta_j = \delta$ and save the state of $G'$ and $G_A$ for
    possible later restoration. If $p = P$ then set
    $t_{p_i} = \min(t_{p_{p+1}}, t_{p_{p+2}}, \ldots, t_{p_k})$ for
    $i = 1,2,\ldots,p$.

9)  For all $b_{ij} \in B$, if $b_{ij} = 0$ then go to step 10; otherwise,
    restore the columns set to zero in step 4 and go to step 2.

10) Let $S$ be the set of nodes in $G'$. Let $K = \{k \mid n_k \in S\}$ and
    $M = \max\{t_i \mid i = 1,2,\ldots,k\}$. If
    $kM - \sum_{i=1}^{k} t_i < \sum_{j \in K} w_j$, then go to step 2.
    Assign the remaining $n_j \in S$ to $G_A$, setting $t_i = t_i + w_j$ and
    $S = S - \{n_j\}$ as the assignment is made, such that when
    $S = \emptyset$ then $C(k) = \max\{t_1, t_2, \ldots, t_k\}$ is minimized.
    This can be achieved using Johnson's algorithm (13) where one starts
    with $\min\{M - t_i \mid i = 1,2,\ldots,k\} > 0$, and looks for a node
    $n_i$ such that $w_i = \max\{w_j \mid j \in K\}$ or a set of such nodes
    $Z$ such that $\sum_{n_j \in Z} w_j$ is maximally less than or equal to
    $\min(M - t_i)$.

Figure 4.2 illustrates how algorithm α iteratively maps $G$ into
$G_A$. Examine $G_3$, $G_4$, and $G_5$. When $G_3$ is analyzed by α, it is
discovered that the critical path splits at the top of $G_{A,3}$, passing
through nodes $c$ and $d$. Since neither is a terminal node of $G'$,
α finds that node $h$ terminates the longest path in $G'$. However,
since two machines must be reserved for nodes $c$ and $d$ and the time
available is only 10 units while $w_h = 11$, i.e., $\delta_h = 1$, node $h$
is not assigned to $G_{A,3}$ at this time and the second machine is kept
idle 10 units instead. The analysis of $G_4$ determines that both $c$ and
$d$ should be assigned to $G_{A,4}$. However, the analysis of $G_5$
indicates that the critical path bypasses $G_A$ for $d_5 = 12$ units of
time, due to the node $h$. Since $d_5 > \delta_h$, algorithm α returns to
$G_3$ and assigns $h$ to $G_{A,3}$. Thus the configuration described by
$G_{4'}$ is achieved.


4.4.   Computational  Complexity  of  Algorithm  α

Bounds on the computational complexity of algorithm α are difficult
to determine unless it is assumed that the restoration to a previous
state $G' \to G_A$ as determined by steps 6, 7 and 8 never occurs. Let us
assume this case. During any iteration of algorithm α either step 7 or
step 8 is executed, but not both. Furthermore, it is assumed that the
basic iteration of α loops from step 9 to step 2, and that word
operations may be used to determine that a row of $B$ is zero. Table 4.1
lists the number of operations of each step.

Step    Number of operations

 1)     $O(n^2/2)$
 2)     $O(k) + O(p) \le O(2k)$
 3)     $O(n)$
 4)     $(k-p)\,O(n) \le (k-1)\,O(n)$
 5)     $O(n) + O(r)$
 6)     $O(r) + O(qr) \le O(r^2 + r)$
 7)     $O(p) \le O(k)$
 8)     $O(q) + O(p) \le O(r) + O(k)$
 9)     $O(n)$
10)     $O(r^2)$

Table 4.1: Number of Operations of Steps in Algorithm α

The computational complexity of steps 2 through 9 is not greater
than $O((2+k)n + 3k + 3r + r^2)$. An upper bound on the overall number
of operations in α, #(α), may be obtained by assuming that one node
is assigned during each iteration of α. We may safely assume $r \ll n$.
Since

    $\sum_{i=1}^{n} i = n(n+1)/2$,    (4.4)

then

    $\#(\alpha) \le O((2+k)n^2/2 + (1 + 7k/2 + 3r + r^2)n) = O((2+k)n^2/2)$.    (4.5)
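
The coefficient of $n$ in (4.5) can be checked directly. The short derivation below rests on an assumption that is an interpretation of the argument above rather than a statement taken from it: the iteration that still has $i$ unassigned nodes costs at most $(2+k)i + 3k + 3r + r^2$ operations, and there are $n$ such iterations.

```latex
\begin{align*}
\#(\alpha) &\le \sum_{i=1}^{n} \bigl[(2+k)\,i + 3k + 3r + r^{2}\bigr] \\
           &= (2+k)\,\frac{n(n+1)}{2} + (3k + 3r + r^{2})\,n \\
           &= \frac{(2+k)\,n^{2}}{2} + \Bigl(1 + \frac{7k}{2} + 3r + r^{2}\Bigr)\,n .
\end{align*}
```

The linear term collects $(2+k)n/2 = n + kn/2$ from the sum (4.4) together with $3kn$, which gives the factor $1 + 7k/2$. With $r \ll n$ the quadratic term dominates.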


On the other hand, a lower bound on #(α) may be obtained by assuming
that $k$ nodes are assigned during each iteration of α. Since

    $k \sum_{i=1}^{n/k} i = (n^2 + kn)/2k$,    (4.6)

then,

    $\#(\alpha) \ge O((2+k)n^2/2k + (1 + 7k/2 + 3r + r^2)n) = O((2+k)n^2/2k)$.    (4.7)


By equations (4.5) and (4.7), the following lemma holds:

Lemma 4.5: $O(n^2) \ge \#(\alpha) \ge O(n^2)/k$.

The assumptions made at the beginning of this section are, in
fact, not unrealistic. As the value of $k$ approaches
$\mathrm{LUB}_{\alpha}(k)$, $d_i > 0$ less often and hence fewer
restorations to a previous state must be performed. Steps 1 and 10 of
algorithm α require $O(n^2/2 + r^2)$ operations, and hence the bounds
given by Lemma 4.5 are valid for algorithm α from step 1 through step 10.

4.5.   Least  Upper  Bound  on  the  Number  of  Machines  Required 

The open question remains, "Given a graph $G$ with critical time
$C_0$, what is the least upper bound on the number of machines required
to process $G$ in time $C_0$?" In other words, what is the lowest value
of $k$ such that $\Omega_{\alpha}(k) = C_0$? The following lemma is used to
establish the bound on $k$.
Lemma 4.6: If $\sum_{i=1}^{f} d_i = 0$, then $\Omega_{\alpha}(k) = C_0$.

Therefore, we can use algorithm α to find a least upper bound on
$k$ in the following way:

1)  Use Theorem 4.1 to find a lower bound on $k$. That is, set
    $k = m$.

2)  Use algorithm α, given $k$. If any $d > 0$ in step 6 of α,
    then set $k = k + 1$ and repeat step 2. Otherwise STOP.

Since the computational complexity of algorithm α is quite
extensive, its indiscriminate use could be quite costly. However, since
Theorem 4.1 determines a lower bound for $k$ which is not smaller
than any other known lower bound, the number of times algorithm α is
required to find $k^* = \mathrm{LUB}_{\alpha}(k)$ is small. In fact, we have
found that $k^*$ is determined within a few iterations. For example see
Figures 4.3 and 4.4 for a relaxed graph $G_R$ mapped onto $G_A$ with $k = 5$.
Furthermore, if algorithm α is only being used to find $k^*$, then it
is never necessary to restore $G_i$ to some previous state $G_j$ $(j < i)$
and hence those computations in step 8 where $\delta > 0$ need not be
performed.
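
The search for $k^*$ described above is a simple loop around whatever scheduler is available. The sketch below is an assumption-laden illustration, not the author's program: `least_machine_count`, `schedule_length`, and the table of completion times are hypothetical names and data, and the stopping test compares the completion time with the critical time, which by Lemma 4.3 is the same as testing whether any $d > 0$ occurred.

```python
def least_machine_count(lower_bound, critical_time, schedule_length, max_machines):
    """Return the first k >= lower_bound whose schedule meets the critical time.

    schedule_length(k) is a caller-supplied function giving the completion
    time of the job on k machines. Returns None when no k up to
    max_machines reaches the critical time.
    """
    for k in range(lower_bound, max_machines + 1):
        if schedule_length(k) <= critical_time:
            return k
    return None


# Hypothetical completion times for a job whose critical time is 30.
completion = {3: 41, 4: 34, 5: 30, 6: 30}
k_star = least_machine_count(3, 30, completion.__getitem__, max_machines=6)
print(k_star)
```

If `schedule_length` comes from an optimal scheduler the result is $k^*$ itself; with a heuristic scheduler it is only an upper bound on $k^*$.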

4.6.  A  Fast,  Non-optimal  Scheduling  Algorithm 

While algorithm α is optimal for $k$ machines, it is an iterative
process, terminating when all nodes have been assigned to $G_A$, and
each iteration requires extensive computation. It is desirable to have
a faster algorithm which need yield results that are only slightly less
than optimal. It turns out that if some $k' \ge k^*$ is given, then the
following algorithm, which is Schwartz's algorithm, serves this
purpose.

Figure 4.3: A Relaxed Graph, $G_R$. $P_i/D_i \le 5$, $i = 1,2,\ldots,5$

Figure 4.4: $G_A$ $(k = 5)$ of $G_R$ in Figure 4.3, by Algorithm α

Algorithm β: Assume that an $n \times n$ connectivity matrix $B$ of a
reduced, directed graph, a vector of weights $w = (w_1, w_2, \ldots, w_n)$ and
$k'$ are given. Then create $k'$ lists (L's) and associate with each
list a scalar $t_i$ $(i = 1,2,\ldots,k')$. Also create a vector
$W = (W_1, W_2, \ldots, W_n)$, where $W_i = w_i + \max\{W_j \mid n_j < n_i\}$.

1)  Set $t_i = 0$, for $i = 1,2,\ldots,k'$ and calculate the vector $W$.

2)  Find the $\min(t_1, t_2, \ldots, t_{k'})$ and identify the $p \le k'$ which
    satisfy the min condition as $p_1, p_2, \ldots, p_p$. Let $n_j$ be the
    node at the end of $L_{p_i}$; for each $p_i$, $i = 1,2,\ldots,p$,
    eliminate the row in $B$ which corresponds to $n_j$, and $W_j$
    from $W$. If $B$ is empty, then STOP.

3)  Let $M = \max(\{W_i \mid W_i \in W\})$, and $S = \{n_i \mid W_i = M$ and the
    $i$th column of $B$ is zero$\}$. While $p > 0$ and $|S| > 0$, for
    $w_j = \max(\{w_i \mid n_i \in S\})$ add $n_j$ to list $L_{p_p}$, set
    $S = S - \{n_j\}$, $p = p - 1$, and $W_j = 0$. If $|S| = 0$, then set each
    $t_{p_i} = \min(t_{p_{p+1}}, t_{p_{p+2}}, \ldots, t_{p_{k'}})$ for
    $i = 1,2,\ldots,p$. Go to step 2.
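
The idea behind algorithm β, giving free machines to the waiting nodes with the largest $W$, can be sketched as an event-driven list scheduler in Python. This is a simplified variant written for illustration and not a transcription of the three steps above: it works on successor lists instead of the matrix $B$, and it lets any ready node start on a free machine rather than only the nodes whose $W$ equals the current maximum. The function name and the example graph are assumptions, and at least one machine is assumed.

```python
import heapq


def list_schedule(succ, w, k):
    """Greedy schedule on k >= 1 identical machines, largest W first.

    Returns (start, makespan), where start maps each node to its start time.
    """
    W = {}  # W[n]: heaviest path from n to a terminal node, n included

    def weight(n):
        if n not in W:
            W[n] = w[n] + max((weight(s) for s in succ[n]), default=0)
        return W[n]

    waiting = {n: 0 for n in succ}  # unfinished predecessors of each node
    for n in succ:
        weight(n)
        for s in succ[n]:
            waiting[s] += 1

    ready = [(-W[n], n) for n in succ if waiting[n] == 0]
    heapq.heapify(ready)
    running = []  # heap of (finish time, node)
    start, now, free = {}, 0, k

    while ready or running:
        # Give free machines to the ready nodes with the largest W.
        while free and ready:
            _, n = heapq.heappop(ready)
            start[n] = now
            heapq.heappush(running, (now + w[n], n))
            free -= 1
        # Advance to the next completion and release every node finishing then.
        now = running[0][0]
        while running and running[0][0] == now:
            _, n = heapq.heappop(running)
            free += 1
            for s in succ[n]:
                waiting[s] -= 1
                if waiting[s] == 0:
                    heapq.heappush(ready, (-W[s], s))
    return start, now


# Hypothetical example graph: a -> c, b -> c, c -> e, d -> e.
succ = {"a": ["c"], "b": ["c"], "c": ["e"], "d": ["e"], "e": []}
w = {"a": 2, "b": 3, "c": 4, "d": 1, "e": 2}
start, makespan = list_schedule(succ, w, k=2)
print(start, makespan)
```

On this small graph two machines already reach the critical time of 9, while a single machine needs the full 12 units of work.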

Figure 4.5 illustrates the assignment of the nodes from $G_R$ of
Figure 4.3 to $k^* = \mathrm{LUB}_{\alpha}(k)$ machines, represented by
$G_{\beta}$, using algorithm β. Note that $\Omega_{\beta}(5) = 31$, whereas
$\Omega_{\alpha}(5) = 30$. One could improve algorithm β slightly by looking
ahead at the time each set of starting nodes is assigned to check for a
split in the critical path; however, then the computational complexity
would be increased beyond that of algorithm α.

Algorithm β is still $O(n^2)$, even though we call it a "fast"
algorithm. The three steps of β require $O(n^2/2)$, $O(2k)$, and
$O(n+k)$ computer operations respectively. Using the same reasoning as
for algorithm α, we find that

    $O(n^2/2) \ge \#(\beta) \ge O(n^2/2k)$

for steps 2 and 3. However, step 1 requires as many operations as the
upper bound of #(β), so in this case

    $\#(\beta) = O(n^2)$

no matter how many nodes are assigned each iteration of β. However,
because of the coefficients in these computations,
$\#(\alpha) \approx 2 \cdot \#(\beta)$ under the best circumstances and
$\#(\alpha) \approx ((3+k)/2) \cdot \#(\beta)$ under the worst circumstances.


4.7.   Multiple  Special  Purpose  Machines 

Suppose a job described by $G$ consists of tasks (nodes) which require
the use of one of $m$ special purpose machines. Let there be
$k_1, k_2, \ldots, k_m$ of these machines respectively to perform the work. Is
it possible to assign the tasks to the machines such that the scheduling
is optimal? Yes; an algorithm similar to α would achieve this goal.


It is necessary to maintain $m$ sets of machines (L's) and at each
iteration of α, the number of available machines of each type is
determined as ${}_{1}p, {}_{2}p, \ldots, {}_{m}p$ respectively. At step 7,
where it is determined that the critical path splits into $v$ paths, it is
necessary to break this down into ${}_{1}v, {}_{2}v, \ldots, {}_{m}v$ paths
for each machine type. Then, in accordance with the strategies of steps 7
and 8, node ${}_{J}n_j$, where the prescript $J$ denotes that the node is
type $J$, is added to the end of ${}_{J}L_{p_p}$.

It is probably easier to understand the changes outlined in the
paragraph above by an example rather than rigorously modifying
algorithm α. Suppose a job described by the graph in Figure 4.6 is given,
and suppose that two different kinds of machines are required:
an arithmetic unit (AU) and a memory unit (MU). Let each starting
node be a "FETCH" operation requiring the use of a memory unit and
each terminal node be a "STORE" operation also requiring a memory unit.
Let all other nodes be operations requiring the use of an arithmetic
unit.

Figure 4.6 is a relaxed graph of the FORTRAN program

      INT1 = A-C
      INT2 = E-A*B
      INT3 = F-C*D
      Q = (INT3-INT2)/INT1
      R = (A*INT3-C*INT2)/INT1
      INT4 = P*(Q-D)+K*L+M*N
      INT5 = O*(Q-B)+G*H+I*J
      INT6 = H*L-J*N
      S = (INT4*N-INT5*J)/INT6
      T = (INT5*L-INT4*H)/INT6
      END

Figure 4.6: $G_R$ of a FORTRAN Program

We assume that the intermediate variables INT1,...,INT6 do not
require use of the MU. Let the fetch-store weight be 2, the
add-subtract weight $w_a = 2$, the multiply weight $w_m = 3$, and the divide
weight be 5. The critical path is indicated by bold connecting
arcs, and the critical time is 33. Since the maximum width of the
critical path of AU-type nodes is 4 and of MU-type nodes is 4 also,
let us assume that there are four AU's and four MU's available to
process this program.

Table 4.2 is a tableau of algorithm α on the graph from Figure
4.6. The order of the nodes listed under $Q$, the set of terminal
nodes of $G'_i$, and the order of the numbers listed under $Y$ and
${}_{\alpha}W$, the longest path value to the set of starting nodes and
terminal nodes respectively, are the same. The column marked $t$ is the
minimum height (in terms of time) of the AU's and MU's. Neither the
machine delays which are inherent in algorithm α, nor the machine
availability (i.e., ${}_{1}p$ and ${}_{2}p$) are indicated in Table 4.2. It
is, however, easy to observe these facets by examining Figure 4.7 while
following the tableau in Table 4.2.

Figure 4.7 indicates the node assignments to the AU's and MU's by
algorithm α. The numbers in parentheses indicate the length of the
time-slot, and the literal indicates the node which has been assigned
to that time-slot. A $\Lambda$-node symbolizes an idle condition, i.e., a
period of time when no useful work is performed by the machine. Note
that for all nodes $n_i$ on the critical path $W_i = {}_{\alpha}W_i$ and hence
four AU's and four MU's are sufficient to process the graph in
critical time 33.

Figure 4.7: Node Assignments to the AU and MU
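
How typed machines change the bookkeeping can be seen in a small Python sketch. It is a plain list-scheduling heuristic written for illustration under stated assumptions, not the modified algorithm α of this section: the function name, the `kind` and `machines` dictionaries, and the four-node example (two fetches, one add, one store, each of weight 2 as in the text) are hypothetical, and every machine type that is needed is assumed to have at least one machine.

```python
import heapq


def typed_list_schedule(succ, w, kind, machines):
    """List scheduling with several machine types.

    kind[n] names the machine type node n needs; machines[t] is the number
    of machines of type t. Ready nodes of each type are served largest W
    first. Returns (start, makespan).
    """
    W = {}  # W[n]: heaviest path from n to a terminal node, n included

    def weight(n):
        if n not in W:
            W[n] = w[n] + max((weight(s) for s in succ[n]), default=0)
        return W[n]

    waiting = {n: 0 for n in succ}  # unfinished predecessors of each node
    for n in succ:
        weight(n)
        for s in succ[n]:
            waiting[s] += 1

    ready = {t: [] for t in machines}  # one priority queue per machine type
    for n in succ:
        if waiting[n] == 0:
            heapq.heappush(ready[kind[n]], (-W[n], n))
    free = dict(machines)
    running, start, now = [], {}, 0

    while running or any(ready.values()):
        for t in machines:
            while free[t] and ready[t]:
                _, n = heapq.heappop(ready[t])
                start[n] = now
                heapq.heappush(running, (now + w[n], n))
                free[t] -= 1
        now = running[0][0]
        while running and running[0][0] == now:
            _, n = heapq.heappop(running)
            free[kind[n]] += 1
            for s in succ[n]:
                waiting[s] -= 1
                if waiting[s] == 0:
                    heapq.heappush(ready[kind[s]], (-W[s], s))
    return start, now


# Hypothetical job X = A + B: two FETCH nodes, one ADD, one STORE.
succ = {"fetch_a": ["add"], "fetch_b": ["add"], "add": ["store"], "store": []}
w = {"fetch_a": 2, "fetch_b": 2, "add": 2, "store": 2}
kind = {"fetch_a": "MU", "fetch_b": "MU", "add": "AU", "store": "MU"}

one_mu = typed_list_schedule(succ, w, kind, {"AU": 1, "MU": 1})[1]
two_mu = typed_list_schedule(succ, w, kind, {"AU": 1, "MU": 2})[1]
print(one_mu, two_mu)
```

With a single memory unit the two fetches are serialized and the job takes 8 time units; a second memory unit brings it down to the critical time of 6.
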
APPENDIX


TREE-HEIGHT  REDUCTION  PROGRAM 

/* HTEXP */

(SUBSCRIPTRANGE):
HTEXP: PROC;
   DCL STRNG(200) FIXED BIN(15) EXTERNAL;
   DCL (OUSTNG,INSTNG)(200) BIN FIXED(15) EXT;
   DCL ZERO  BIN FIXED(15) INITIAL(000) EXTERNAL;
   DCL OPAND BIN FIXED(15) INITIAL(500) EXTERNAL;
   DCL EQU   BIN FIXED(15) INITIAL(501) EXTERNAL;
   DCL SEC   BIN FIXED(15) INITIAL(502) EXTERNAL;
   DCL LEFTP BIN FIXED(15) INITIAL(503) EXTERNAL;
   DCL RIGHP BIN FIXED(15) INITIAL(504) EXTERNAL;
   DCL PLUS  BIN FIXED(15) INITIAL(505) EXTERNAL;
   DCL MINUS BIN FIXED(15) INITIAL(506) EXTERNAL;
   DCL MULT  BIN FIXED(15) INITIAL(507) EXTERNAL;
   DCL DIV   BIN FIXED(15) INITIAL(508) EXTERNAL;
   DCL COMMA BIN FIXED(15) INITIAL(509) EXTERNAL;
   DCL EXPO  BIN FIXED(15) INITIAL(510) EXTERNAL;
   DCL (S,MHT) BIN FIXED(15) EXTERNAL;
   DCL ADR   BIN FIXED(15) INITIAL(511) EXTERNAL;
   DCL (BKROW(30),BKCOL(30)) BIN FIXED(15);
   DCL (SC,AD,SU,MU,DV) BIN FIXED(15);
   DCL WGHT(0:4) BIN FIXED(15) INITIAL(2,2,3,5,1);
   DCL W0 BIN FIXED(15);
   DCL W2 BIN FIXED(15);
   DCL ARRAY(0:19,0:159) BIN FIXED(15) EXTERNAL,
       EXP  (0:19,0:159) BIN FIXED(15) DEFINED ARRAY;
   SC=SEC-EQU; AD=PLUS-EQU; SU=MINUS-EQU; MU=MULT-EQU; DV=DIV-EQU;
   W0=WGHT(0);         W2=WGHT(2);
   ARRAY=0;
   DCL TABLES(20) FIXED BIN(15) EXT INIT((20)0);
   TABLES=0;
   BEGIN;
      DECLARE (I,J,JP) BIN FIXED(15),
              HAN BIN FIXED(15) DEFINED S;
      DECLARE COPYLINE ENTRY(BIN FIXED(15),BIN FIXED(15),BIN FIXED(15),BIN FIXED(15)),
              COPYGRPS ENTRY(BIN FIXED(15),BIN FIXED(15),BIN FIXED(15),BIN FIXED(15)),
              CREATEGP ENTRY(BIN FIXED(15),BIN FIXED(15),BIN FIXED(15),BIN FIXED(15)),
              ADDGRPS  ENTRY(BIN FIXED(15),BIN FIXED(15),BIN FIXED(15),BIN FIXED(15)),
              ZGRP     ENTRY(BIN FIXED(15),BIN FIXED(15),BIN FIXED(15),BIN FIXED(15)),
              EVAL     ENTRY(BIN FIXED(15),BIN FIXED(15)) RETURNS(BIN FIXED(15)),
              FORM     ENTRY(BIN FIXED(15),BIN FIXED(15)),
              MOCMP    ENTRY(BIN FIXED(15),LABEL),
              EX2      ENTRY(BIN FIXED(15)) RETURNS(BIN FIXED(15)),
              CSGRPHT  ENTRY(BIN FIXED(15)) RETURNS(BIN FIXED(15));
      DECLARE B19 BIN FIXED(15) INITIAL(19),
              B20 BIN FIXED(15) INITIAL(20),
              BPT BIN FIXED(15) INITIAL(157),
              LIM BIN FIXED(15) INITIAL(159);
   COPYLINE: PROCEDURE (I,IP,J,JP);
      DECLARE (I,IP,J,JP,K) BIN FIXED(15);
      DO K=0B BY 1B TO EXP(I,0B)-J;
        EXP(IP,K+JP)=EXP(I,K+J); END;
    EXP(IP,0B)=JP+K-1B;
    EXP(IP,LIM)=0B;
END COPYLINE;

COPYGRPS: PROCEDURE (I,IP,J,JP); /*J,JP BY NAME*/
    DECLARE (I,IP,J,JP) BIN FIXED(15);
    EXP(IP,JP)=EXP(I,J);
    EXP(IP,JP+1B)=EXP(I,J+1B);
    JP=JP+10B;
    IF EXP(I,J+1B)<0B THEN DO;
        J=J+10B;
        END;
    ELSE DO J=J+10B BY 1B TO EXP(I,J+1B)+J+10B;
        EXP(IP,JP)=EXP(I,J);
        JP=JP+1B; END;
END COPYGRPS;

CREATEGP: PROCEDURE (I,PT,TYPE,HT); /*TYPE=WGHT(0|1|2|3);PT=LOC 1STGP*/
    DECLARE (I,PT,TYPE,HT,K,GR) BIN FIXED(15);
    EXP(I,PT-1B)=TYPE;
    GR=MOD(HT,TYPE);
    DO K=0B BY 1B TO TYPE-1B;
        EXP(I,K+PT)=0B; END;
    EXP(I,K+PT)=HT;
    EXP(I,GR+PT)=EX2(HT/TYPE);
END CREATEGP;

ADDGRPS: PROCEDURE (I,IP,J,PT); /*J BY NAME*/
    DECLARE (I,IP,J,PT,K) BIN FIXED(15);
    J=J+10B;
    DO K=0B TO EXP(IP,PT-1B)-1B;
        EXP(IP,K+PT)=EXP(IP,K+PT)+EXP(I,K+J); END;
    EXP(IP,K+PT)=-1B;
    J=J+K+1B; /*J=1ST LOC AFTER OLD STRING*/
END ADDGRPS;

ZGRP: PROCEDURE (I,PT,OP,WGHT);
    DECLARE (I,PT,OP,WGHT,K) BIN FIXED(15);
    EXP(I,PT-10B)=OP;
    EXP(I,PT-1B)=WGHT;
    DO K=PT TO PT+WGHT-1B;
        EXP(I,K)=0B; END;
    EXP(I,K)=-10B;
END ZGRP;

EVAL: PROCEDURE (I,J) RETURNS (BIN FIXED(15));
    DECLARE (I,J,M,FIN,CLS,K,K1,TMP(0:9),CNT) BIN FIXED(15);
    CNT=EXP(I,EXP(I,J-1B)+J);
    IF CNT>=0B THEN RETURN(CNT);
    DO K=0B BY 1B TO EXP(I,J-1B)-1B;
        TMP(K)=EXP(I,K+J); END;
    CLS=0B;
    CNT=0B;
    M=1B;
EVAL1: FIN=0B;
    DO K=0B BY 1B TO EXP(I,J-1B)-1B;
        IF CLS=0B THEN DO;
            CLS=TMP(K)&M;
            K1=K;
            TMP(K)=TMP(K)-CLS;
            END;
        ELSE IF (TMP(K)&M)>0B THEN DO;
            TMP(K)=TMP(K)+CLS;
            CLS=0B;
            END;
        FIN=FIN+TMP(K); END;
    IF FIN=0B THEN RETURN (EXP(I,J-1B)*CNT+K1);
    M=M+M;
    CNT=CNT+1B;
    CLS=CLS+CLS;
    GO TO EVAL1;
END EVAL;

FORM: PROCEDURE (TYPE,I);
    DECLARE (I,J,JP,TYPE,K) BIN FIXED(15);
    PUT EDIT(I,' FORM',TYPE)(F(3),A(5),F(2)) SKIP;
    CALL COPYLINE(I,B19,0B,0B);
    J=1B;
    JP=1B;
FORM1: IF EXP(B19,J)=SC THEN DO;
        EXP(I,JP)=SC;
        EXP(I,0B)=JP;
        PUT EDIT ((EXP(I,K) DO K=0 BY 1 TO EXP(I,0B)))((160)F(5)) SKIP;
        RETURN;
        END;
    ELSE IF EXP(B19,J+1B)<0B | EXP(B19,J+1B)=TYPE THEN
        CALL COPYGRPS(B19,I,J,JP);
    ELSE DO;
        EXP(I,JP)=EXP(B19,J);
        CALL CREATEGP(I,JP+10B,TYPE,EVAL(B19,J+10B));
        JP=JP+11B+TYPE;
        J=J+11B+EXP(B19,J+1B);
        END;
    GO TO FORM1;
END FORM;
MDCMP: PROCEDURE (IN,ERROR) RECURSIVE;
    DECLARE (I,J,JP,MP,II,JJ,JJP,NDIS,M,FIN,K,DP,TLOC(0:9),TERMS,POINT,C,
            N,HT,GR,DVPT,TMP(0:9),IN) BIN FIXED(15),
        OP(15) LABEL INITIAL (SMC,ERR,ERR,ADD,ADD,MUL,DIV,ERR,ERR,
            ERR,ERR,ERR,ERR,ERR,ERR),
        ERROR LABEL;
    DECLARE COPYGRPT ENTRY(BIN FIXED(15),BIN FIXED(15),BIN FIXED(15),
            BIN FIXED(15)),
        MHOLEDIS ENTRY (BIN FIXED(15),LABEL,LABEL),
        FINDBIT ENTRY (BIN FIXED(15),BIN FIXED(15),BIN FIXED(15)),
        DVEVAL ENTRY (BIN FIXED(15));

COPYGRPT: PROCEDURE (I,IP,J,JP); /*J,JP BY NAME*/
    DECLARE (I,IP,J,JP) BIN FIXED(15);
    EXP(IP,JP)=EXP(I,J);
    EXP(IP,JP+1B)=EXP(I,J+1B);
    JP=JP+10B;
    IF EXP(I,J+1B)<0B THEN DO;
        J=J+10B;
        TERMS=TERMS+1B;
        TLOC(TERMS)=JP-1B;
        END;
    ELSE DO J=J+10B BY 1B TO EXP(I,J+1B)+J+10B;
        EXP(IP,JP)=EXP(I,J);
        JP=JP+1B; END;
END COPYGRPT;

MHOLEDIS: PROCEDURE (II,L1,L2) RECURSIVE;
    DECLARE NDISFLG BIT(1);
    DECLARE (II,CLS) BIN FIXED(15);
    DECLARE (L1,L2) LABEL;
    CALL MDCMP(II,ERROR);
    NDIS=BPT-W0-1B;
    EXP(II,LIM)=0B;
    EXP(II,LIM-1B)=0B;
    EXP(I,MP+W2)=-1B;
    M=1;
TERM1: FIN=0;
    DO K=0B BY 1B TO W2-1B;
        CLS=EXP(I,K+MP)&M;
        EXP(I,K+MP)=EXP(I,K+MP)-CLS;
        FIN=FIN+EXP(I,K+MP);
        IF CLS>0 THEN DO;
            NDISFLG=1B;
            JJ=1B;
            JJP=1B;
            CALL ZGRP(II,NDIS,EXP(II,1),W0);
DISTRIB:    IF EXP(II,JJ)=SC THEN IF NDISFLG THEN DO; /*IE NEVER DISTRIB*/
                DO JJ=K+11B BY W2+11B TO EXP(II,0B);
                    EXP(II,JJ)=EXP(II,JJ)-CLS; END;
                EXP(I,K+MP)=CLS+EXP(I,K+MP);
                EXP(II,LIM)=EVAL(II,NDIS);
                EXP(II,LIM-1B)=1B;
                IF TERMS=1B&DP>0B THEN GO TO L1;
                GR=MOD(EXP(II,LIM),W2);
                EXP(I,GR+MP)=EXP(I,GR+MP)+EX2(EXP(II,LIM)/W2);
                RETURN;
                END;
            ELSE DO;
                IF EXP(II,BPT-1B)=-10B THEN
                    GO TO DISTRIB1; /*IE ALWAYS DISTRIB*/
                EXP(II,JJP)=EXP(II,NDIS-2); /*SIGN*/
                JJP=JJP+2;
                CALL CREATEGP(II,JJP,W2,EVAL(II,NDIS)); /*M-FORM*/
                EXP(II,K+JJP)=EXP(II,K+JJP)+CLS;
                JJP=JJP+1B+W2;
                EXP(II,JJP-1B)=-1B;
                EXP(II,LIM-1B)=10B;
DISTRIB1:       EXP(II,JJP)=SC;
                EXP(II,0B)=JJP;
                GO TO TERMEND;
                END;
            HT=EVAL(II,JJ+2);
            EXP(II,K+JJ+2)=EXP(II,K+JJ+2)+CLS;
            EXP(II,JJ+10B+W2)=-1B;
            IF EVAL(II,JJ+10B)<HT+W2 THEN DO;
                CALL COPYGRPS(II,II,JJ,JJP); /*DISTRIBUTED*/
                NDISFLG=0B;
                END;
            ELSE DO;
                GR=MOD(HT,W0); /* NO DISTRIB*/
                EXP(II,GR+NDIS)=EXP(II,GR+NDIS)+EX2(HT/W0);
                IF EXP(II,JJ)¬=EXP(II,NDIS-2) THEN EXP(II,NDIS-2)=AD;
                EXP(II,BPT-1B)=-1B;
                JJ=JJ+11B+W2;
                END;
            GO TO DISTRIB;
            END;  END;
TERMEND: IF FIN=0 THEN DO;
        EXP(I,MP-10B)=EXP(I,MP-10B)+B20;
        MP=-1B;
        IF TERMS=1B THEN IF DP<0B THEN DO;
            CALL COPYLINE(II,I,1B,EXP(I,0B));
            RETURN;
            END;
        ELSE GO TO L1;
        GO TO L2;
        END;
    M=M+M;


    GO TO TERM1;
END MHOLEDIS;

FINDBIT: PROCEDURE(CLS,K,CNT);
    DECLARE (CLS,K,CNT) BIN FIXED(15);
    CNT=0B;
AGAIN: FIN=0B;
    DO K=0B BY 1B TO W2-1B;
        CLS=TMP(K)&M;
        IF CLS>0B THEN DO;
            TMP(K)=TMP(K)-M;
            RETURN;
            END;
        FIN=FIN+TMP(K);      END;
    IF FIN=0B THEN RETURN;
    CNT=CNT+1B;
    M=M+M;
    GO TO AGAIN;
END FINDBIT;

DVEVAL: PROCEDURE(DH) RECURSIVE;
    DECLARE (DH,GR,NH(0:2),CLS(0:2,0:2)) BIN FIXED(15),
        FINDHT ENTRY(BIN FIXED(15),LABEL);
FINDHT: PROCEDURE(N,L);
    DECLARE N BIN FIXED(15), L LABEL;
    CALL FINDBIT(CLS(N,0),CLS(N,1),CLS(N,2));
    IF CLS(N,0)=0B THEN DO;
        IF TERMS>0B THEN GO TO L;
        NH(N)=DH+1B;
        END;
    ELSE NH(N)=W2*CLS(N,2)+CLS(N,1);
    IF TERMS>0B THEN IF NH(N)>=EXP(II,LIM) THEN GO TO L;
END FINDHT;

/*JAK NEXT S */

    IF TERMS>0B THEN DO;
        II=ABS(EXP(I,TLOC(1)));
        HT=CSGRPHT(II);
        END;
    IF MP<0B THEN DO;
DIVDIST: /*DIVDIST*/;
        CALL CREATEGP(II,10B,W2,EXP(II,LIM)+WGHT(3));
        EXP(II,0)=W2+100B;
        EXP(II,EXP(II,0))=SC;
        RETURN;
        END;
    GR=MOD(DH,W2);
ITERATE: DO K=0B TO W2-1B;
        TMP(K)=EXP(I,K+MP);     END;
    PUT DATA(TMP);


DVEVAL1:
CASEA:
CASEB:
DVEVAL2:
CASEC:
DVEVAL3:

    M=1B;
    CALL FINDBIT(CLS(0,0),CLS(0,1),CLS(0,2));
    NH(0)=W2*CLS(0,2)+CLS(0,1);
    IF TERMS>0B THEN IF NH(0)>=EXP(II,LIM) THEN DO;
        IF DH<EXP(II,LIM) THEN GO TO DIVDIST;
        K=MOD(EXP(II,LIM),W2);
        PUT LIST('DVEVAL1');
        EXP(I,K+MP)=EXP(I,K+MP)+EX2(EXP(II,LIM)/W2);
        EXP(I,TLOC(1))=-1B;
        EXP(I,TLOC(1)-1)=EXP(I,TLOC(1)-1)+B20;
        TERMS=0B;
        GO TO ITERATE;
        END;
    IF DH<NH(0) THEN DO;
        EXP(I,MP+CLS(0,1))=EXP(I,MP+CLS(0,1))-CLS(0,0);
        NH(0)=NH(0)+WGHT(3);
        K=MOD(NH(0),W2);
        EXP(I,MP+K)=EXP(I,MP+K)+EX2(NH(0)/W2);
        EXP(I,MP+W2)=-1B;
        RETURN;
        END;
    CALL FINDHT(1,DVEVAL1);
    IF DH<NH(1) THEN DO;
        EXP(I,MP+CLS(0,1))=EXP(I,MP+CLS(0,1))-CLS(0,0);
        NH(0)=DH+WGHT(3);
        GO TO CASEA;
        END;
    IF NH(1)+W2<=DH THEN DO;
        EXP(I,MP+CLS(0,1))=EXP(I,MP+CLS(0,1))-CLS(0,0);
        EXP(I,MP+CLS(1,1))=EXP(I,MP+CLS(1,1))+CLS(1,0);
        GO TO ITERATE;
        END;
    CALL FINDHT(2,DVEVAL3);
    IF DH<NH(2) THEN IF NH(2)<NH(1)+WGHT(3) THEN DO;
        EXP(I,MP+CLS(1,1))=EXP(I,MP+CLS(1,1))-CLS(1,0);
        GO TO CASEB;
        END;
    GO TO DVEVAL2;
    IF DH<EXP(II,LIM) THEN IF EXP(II,LIM)<NH(1)+WGHT(3) THEN GO TO CASEC;
    GO TO DVEVAL2;
END DVEVAL;

/*BEGIN MDCMP STMTS*/
    I=IN;
    CALL FORM(W2,I);
    TERMS=0B;
    J=1B;
    JP=1B;
    DP=-1B;
MDCMP1: POINT=EXP(I,J);
    IF POINT<1B THEN GO TO ERR;
    IF POINT<1001B THEN GO TO OP(POINT);
    /*DLT NULL TERM OR GROUPS*/
    JJ=J+11B+EXP(I,J+1B);
    IF POINT<MU+B20 THEN IF EXP(I,JJ)>B20 THEN EXP(I,JJ)=POINT;
        ELSE EXP(I,JJ)=POINT-B20;
    J=JJ;
    GO TO MDCMP1;
ADD: IF TERMS>0B THEN DO;  /*ADD OR SUB*/
ADD1:   II=ABS(EXP(I,TLOC(TERMS)));
        HT=CSGRPHT(II);
        DO K=TERMS-1B BY -1B TO 1B;
            II=ABS(EXP(I,TLOC(K)));
            IF CSGRPHT(II)>HT THEN DO;
                N=TLOC(TERMS); TLOC(TERMS)=TLOC(K); TLOC(K)=N;
                HT=EXP(II,LIM);
                END; END;
        II=ABS(EXP(I,TLOC(TERMS)));
        IF MP<0B THEN IF TERMS=1B THEN DO;
            CALL COPYLINE(I,B19,J,J);
            EXP(I,JP)=MU;
            MP=JP+10B;
            CALL CREATEGP(I,MP,W2,EXP(II,LIM));
            JP=JP+11B+W2;
            CALL COPYLINE(B19,I,J,JP);
            GO TO ELIMT;
            END;
        ELSE GO TO CKDV;
        CALL MHOLEDIS(II,CKDV,ADD);
ELIMT:  EXP(I,TLOC(TERMS)-1B)=EXP(I,TLOC(TERMS)-1B)+B20;
        EXP(I,TLOC(TERMS))=-1B;
        PUT EDIT(I,(EXP(I,K) DO K=0B TO EXP(I,0B)))((160)F(5)) SKIP;
        EXP(II,LIM-10B)=I;
        TERMS=TERMS-1B;
        IF TERMS>0B THEN GO TO ADD1;
        GO TO MDCMP2;
        END;
CKDV: IF DP>0B THEN CKDV1: DO;
        N=EXP(DP,0);
ORDER:  DO K=W2+111B BY 10B TO N;
            IF CSGRPHT(ABS(EXP(DP,K)))>CSGRPHT(ABS(EXP(DP,K-10B))) THEN DO;
                M=EXP(I,K); EXP(I,K)=EXP(I,K-10B); EXP(I,K-10B)=M;
        END ORDER;
        N=N-10B;
        IF N>W2+110B THEN GO TO ORDER;
CKDV2:  IF EXP(DP,0)=100B+W2 THEN GO TO CKDV3;

CKDV3:
CKDV4:
CKDV7:
CKDV5:
CKDV6:

        IF EXP(DP,W2+11B)=-10B THEN DO;
            CALL CREATEGP(DP,11B,W2,CSGRPHT(ABS(EXP(DP,EXP(DP,0)-1))));
            EXP(DP,0)=EXP(DP,0)-10B;
            EXP(DP,EXP(DP,0))=SC;
            END;
        IF EXP(DP,W2+11B)=-1B THEN CKDV3A: DO;
            HT=EVAL(DP,11B);
            IF EXP(DP,MOD(HT,W2)+11B)=EX2(HT/W2) THEN EXP(DP,W2+11B)=HT;
            ELSE DO;
                DO K=0B TO W2-1B;
                    TMP(K)=EXP(DP,K+11B);  END;
                M=1B;
                CALL FINDBIT(N,K,C);
                HT=W2*C+K;
                IF EXP(DP,0)=100B+W2 THEN GO TO CKDV4;
                II=ABS(EXP(DP,EXP(DP,0)-1B));
                IF CSGRPHT(II)<=HT THEN DO;
                    GR=MOD(EXP(II,LIM),W2);
                    EXP(DP,GR+11B)=EXP(DP,GR+11B)+EX2(EXP(II,LIM)/W2);
                    EXP(DP,0)=EXP(DP,0)-10B;
                    EXP(DP,EXP(DP,0))=SC;
                    EXP(DP,W2+11B)=-1B;
                    GO TO CKDV3A;
                    END;
                CALL COPYLINE(DP,HAN+1B,0B,0B);
                EXP(HAN+1B,K+11B)=EXP(HAN+1B,K+11B)-N;
                CALL COPYLINE(DP,HAN+10B,0B,0B);
                IF CSGRPHT(HAN+10B)=CSGRPHT(HAN+1B) THEN DO;
                    EXP(DP,K+11B)=EXP(DP,K+11B)-N;
                    CALL FINDBIT(N,K,C);
                    IF EXP(DP,0)=100B+W2 THEN GO TO CKDV6;
                    IF CSGRPHT(II)>W2*C+K THEN DO;
                        EXP(DP,K+11B)=EXP(DP,K+11B)+N;
                        GO TO CKDV7;
                        END;
                    CALL CREATEGP(HAN+1B,11B,W2,HT);
                    EXP(HAN+1B,1)=AD;
                    EXP(HAN+1B,W2+100B)=MU;
                    EXP(HAN+1B,W2+101B)=-II;
                    EXP(HAN+1B,W2+110B)=SC;
                    EXP(HAN+1B,0)=W2+110B;
                    EXP(HAN+1B,LIM)=0B;
                    HT=CSGRPHT(HAN+1B);
                    CALL COPYLINE(HAN+1B,II,0B,0B);
                    GO TO CKDV1;
                    END;
                JJP=1B;
                IF MP>0B THEN DO;
                    JJ=MP-10B;


                    CALL COPYGRPS(I,HAN+10B,JJ,JJP);
                    END;
                IF TERMS>0B THEN DO;
                    JJ=TLOC(1)-1B;
                    CALL COPYGRPS(I,HAN+10B,JJ,JJP);
                    EXP(HAN+10B,JJP-10B)=MU;
                    END;
                EXP(HAN+10B,1)=AD;
                EXP(HAN+10B,JJP)=SC;
                EXP(HAN+10B,0)=JJP;
                EXP(HAN+10B,LIM)=0B;
                IF EXP(HAN+1B,LIM)>CSGRPHT(HAN+10B)+WGHT(3) THEN DO;
                    CALL DVEVAL(HT);
                    FIN=0B;
                    DO K=11B TO W2+10B;
                        FIN=FIN+EXP(HAN+1B,K);       END;
                    IF FIN¬=0B THEN EXP(HAN+1B,K)=-10B;
                    CALL COPYLINE(HAN+1B,DP,0B,0B);
                    GO TO CKDV2;
                    END;
                CALL CREATEGP(DP,11B,W2,CSGRPHT(DP));
                EXP(DP,0)=100B+W2;
                EXP(DP,1)=AD;
                EXP(DP,100B+W2)=SC;
        END CKDV3A;
        IF EXP(DP,0)>100B+W2 THEN GO TO CKDV4;
        EXP(DP,LIM)=EVAL(DP,11B);
        CALL DVEVAL(EXP(DP,LIM));
        EXP(I,DVPT)=DV+B20;
        EXP(I,DVPT+1B)=-1B;
        IF TERMS=1B THEN DO;
            IF MP<0B THEN
                CALL COPYLINE(ABS(EXP(I,TLOC(1))),I,1B,EXP(I,0));
            ELSE DO;
                CALL CREATEGP(B19,11B,W2,CSGRPHT(ABS(EXP(I,TLOC(
                JJ=1B;
                CALL ADDGRPS(B19,I,JJ,MP);
                END;
            GO TO ELIMT;
            END;
MDCMP2: DP=-1B;
        J=TLOC(0);
        JP=J;
        GO TO MDCMP1;
        END;
    MP=-1B;
    TLOC(0)=JP;
    CALL COPYGRPT(I,I,J,JP);
    IF TERMS=0B THEN MP=JP-W2-1B;


    GO TO MDCMP1;
MUL: IF EXP(I,J+1B)<0B THEN CALL COPYGRPT(I,I,J,JP);
    ELSE IF MP<0B THEN DO;
        MP=JP+10B;
        CALL COPYGRPS(I,I,J,JP);
        END;
    ELSE CALL ADDGRPS(I,I,J,MP);
    GO TO MDCMP1;
DIV: IF DP<0B THEN DO;
        HAN=HAN+1B;
        DP=HAN;
        EXP(DP,LIM)=0B;
        EXP(DP,LIM-1B)=11B;
        EXP(DP,LIM-10B)=I;
        IF EXP(I,J+1B)<0B THEN DO;
            CALL ZGRP(DP,11B,AD,W2);
            EXP(DP,W2+100B)=SC;
            EXP(DP,0B)=W2+100B;
            EXP(I,J)=MU;
            END;
        ELSE DO;
            EXP(I,J)=AD;
            EXP(DP,0B)=1B;
            END;
        CALL COPYGRPS(I,DP,J,EXP(DP,0B));
        EXP(DP,EXP(DP,0B))=SC;
        EXP(I,JP)=DV;
        DVPT=JP;
        EXP(I,JP+1B)=-DP;
        JP=JP+10B;
        END;
    ELSE DO;
        EXP(I,J)=MU;
        IF EXP(I,J+1B)<0B THEN DO;
            CALL COPYGRPS(I,DP,J,EXP(DP,0));
            EXP(DP,EXP(DP,0))=SC;
            END;
        ELSE CALL ADDGRPS(I,DP,J,11B);
        END;
    GO TO MDCMP1;
SMC: EXP(I,JP)=SC;
    EXP(I,0)=JP;
    IF TERMS=0B&DP<0B THEN RETURN;
    GO TO ADD;
ERR: PUT DATA(I,J,POINT,TERMS,DP,MP) SKIP;
    GO TO ERROR;
END MDCMP;

CSGRPHT: PROCEDURE (I) RECURSIVE RETURNS (BIN FIXED(15));
    DECLARE (I,J,AP,K,TEMP(0:159)) BIN FIXED(15);
    IF EXP(I,LIM)¬=0B THEN RETURN (EXP(I,LIM));
    CALL MDCMP(I,ERROR);
    DO K=0B BY 1B TO EXP(I,0B);
        TEMP(K)=EXP(I,K);        END;
    CALL FORM(W0,I);
    J=W0+100B;
    AP=11B;
ASCMP: IF EXP(I,J)=SC THEN DO;
        IF EXP(I,1B)=SU THEN EXP(I,11B)=EXP(I,11B)+10B; /*UNARY MINUS*/
ASCMP1: EXP(I,LIM)=EVAL(I,11B);
        DO K=0B BY 1B TO TEMP(0B);
            EXP(I,K)=TEMP(K);   END;
        RETURN(EXP(I,LIM));
        END;
    IF EXP(I,J)¬=EXP(I,1B) THEN EXP(I,1B)=AD;
    IF EXP(I,J)=AD|EXP(I,J)=SU THEN CALL ADDGRPS(I,I,J,AP);
    ELSE GO TO ERROR;
    GO TO ASCMP;
ERROR: PUT EDIT(' ERROR AT ROW ',I)(A(14),F(2)) SKIP;
    GO TO ASCMP1;
END CSGRPHT;

/*BEGIN BEGIN BLOCK STMTS*/
/* E.G. WEIGHTS: ADD WT=SUB WT; MUL WT<DIV WT; REQUIRED*/
/* WGHT(4)=MEM ACCESS FOR ATOM*/
    ON ERROR BEGIN; REVERT ERROR; CLOSE FILE(SYSPRINT); END;
STAND: BEGIN;
    ON SUBSCRIPTRANGE BEGIN;
        REVERT ERROR;
        PUT PAGE;
        PUT DATA(C,R,P,LOC,T,GOROW);
        PUT DATA(BKROW,BKCOL);
        PUT PAGE;
        DO II=1 TO 100 BY 1;
            PUT SKIP EDIT((ARYA(III,II) DO III=1 TO 20))(20 F(6)); END;
        PUT PAGE;
        DO II=1 TO 100 BY 1;
            PUT SKIP EDIT((ARYB(III,II) DO III=1 TO 20))(20 F(6)); END;
        CLOSE FILE(SYSPRINT);
    END;
    DCL I FIXED BIN(15);
    DCL (ARYA,ARYB)(20,100) FIXED BIN(15);
    DCL (P,R,C,T,GOROW,LOC) BIN FIXED(15);
    DCL (PMSW,MISW,DIVSW)(30) BIT(1) INIT((30)0B);
    P=2; R=1; C=1; GOROW=-2; ARYA(1,1)=PLUS;


LP1: P=P+1; IF INSTNG(P) < OPAND THEN
        DO; ARYB(R,C)=INSTNG(P); C=C+1; GO TO CLR; END;
    IF INSTNG(P)=LEFTP THEN DO; ARYB(R,C)=GOROW; T=R;
        R=-GOROW; BKROW(R)=T; BKCOL(R)=C+1; C=1;
        GOROW=GOROW-1; ARYA(R,C)=PLUS; GO TO CLR; END;
    IF INSTNG(P)=PLUS THEN
        DO; IF C > 1 THEN PMSW(R)=1B;
            ARYA(R,C)=INSTNG(P); GO TO CLR; END;
    IF INSTNG(P)=MINUS THEN
        DO; IF C > 1 THEN PMSW(R)=1B;
            ARYA(R,C)=INSTNG(P); GO TO CLR; END;
    IF INSTNG(P)=RIGHP THEN DO; ARYA(R,C)=SEC;
            C=BKCOL(R); R=BKROW(R); GO TO CLR; END;
    IF INSTNG(P)¬=SEC THEN
        DO; ARYA(R,C)=INSTNG(P); GO TO CLR; END;
    ARYA(R,C)=SEC; INSTNG(P)=ZERO; GO TO ASEM;
CLR: INSTNG(P)=ZERO; GO TO LP1;
ASEM: P=3; R=1; C=1; LOC=3; INSTNG(3)=PLUS;
TEXT: IF ARYA(R,C)=SEC THEN
        DO; DIVSW(R)=0B; MISW(R)=0B;
            IF R=1 THEN DO; PMSW(1)=0B; INSTNG(P)=SEC;
                DO I=1 TO P; STRNG(I)=INSTNG(I); END; GO TO DASEM; END;
            IF PMSW(R) THEN
                DO; PMSW(R)=0B; INSTNG(P)=RIGHP; P=P+1; END;
            C=BKCOL(R); R=BKROW(R); GO TO TEXT;
        END;
    IF MISW(R) THEN
        IF ARYA(R,C)=PLUS THEN INSTNG(P)=MINUS;
        ELSE IF ARYA(R,C)=MINUS THEN INSTNG(P)=PLUS;
        ELSE GO TO TRANS;
    ELSE IF DIVSW(R) THEN
        IF ARYA(R,C)=MULT THEN INSTNG(P)=DIV;
        ELSE INSTNG(P)=MULT;  ELSE GO TO TRANS;
    P=P+1;
LP2: IF ARYB(R,C) > ZERO THEN GO TO POS;
    IF ARYA(R,C)=MULT | ARYA(R,C)=DIV THEN GO TO MD;
    ELSE IF ARYA(R,C)=EXPO THEN GO TO EXP;
PM: T=C+1;
    IF ARYA(R,T)=PLUS | ARYA(R,T)=MINUS | ARYA(R,T)=SEC
    THEN DO; R=-ARYB(R,C); C=1;
        IF PMSW(R) THEN
        DO; IF INSTNG(P-1)=MINUS THEN
                DO; PMSW(R)=0B; MISW(R)=1B; P=P-1; END;
            IF INSTNG(P-1)=PLUS THEN
                DO; PMSW(R)=0B; P=P-1; END;
            GO TO TEXT;
        END;
        P=P-1; GO TO TEXT;
    END;


    ELSE IF ARYA(R,T)=EXPO THEN GO TO EXP;
    ELSE DO; R=-ARYB(R,C); C=1;
        IF ¬PMSW(R) THEN P=P-1B;
        GO TO TRANS;
    END;
EXP: R=-ARYB(R,C); C=1; PMSW(R)=1B; GO TO TRANS;
MD: T=C+1;
    IF ARYA(R,T)=EXPO THEN GO TO EXP;
    R=-ARYB(R,C); C=1;
    IF ¬PMSW(R) THEN
        DO; IF INSTNG(P-1)=DIV THEN DIVSW(R)=1B;
            IF ARYA(R,C)=MINUS THEN
                IF INSTNG(LOC)=PLUS THEN INSTNG(LOC)=MINUS;
                ELSE INSTNG(LOC)=PLUS;
            GO TO LP2; END;
TRANS: IF C=1 THEN
        DO; IF R>1 & PMSW(R) THEN
                DO; INSTNG(P)=LEFTP; P=P+1;
                    INSTNG(P)=PLUS; LOC=P; END;
            IF ARYA(R,C)=MINUS THEN
            DO; IF INSTNG(P)=MINUS THEN
                    DO; INSTNG(P)=PLUS; LOC=P; P=P+1; GO TO LP2; END;
                IF INSTNG(P)=PLUS THEN
                    DO; INSTNG(P)=MINUS; LOC=P; P=P+1; GO TO LP2; END;
                IF INSTNG(LOC)=PLUS THEN INSTNG(LOC)=MINUS;
                ELSE INSTNG(LOC)=PLUS;
                P=P+1; GO TO LP2;
            END;
            IF INSTNG(P)=PLUS | INSTNG(P)=MINUS THEN LOC=P;
            P=P+1; GO TO LP2;
        END;
    INSTNG(P)=ARYA(R,C);
    IF ARYA(R,C)=PLUS | ARYA(R,C)=MINUS THEN LOC=P;
    P=P+1; GO TO LP2;
POS: INSTNG(P)=ARYB(R,C); P=P+1; C=C+1; GO TO TEXT;
DASEM: BEGIN;
    DCL (I,J,OP) BIN FIXED(15);
    DCL S BIN FIXED(15) EXTERNAL;
    P=2; I=0; J=0; GOROW=-1; OP=PLUS; S=0;
PL1: P=P+1;
    IF INSTNG(P) < OPAND THEN
        DO; ARRAY(I,J)=OP; J=J+1;
            ARRAY(I,J)=INSTNG(P); J=J+1; GO TO PL1; END;
    IF INSTNG(P)=LEFTP THEN
        DO; ARRAY(I,J)=OP; J=J+1;
            ARRAY(I,J)=GOROW; T=I; I=-GOROW;
            BKROW(I+1)=T; BKCOL(I+1)=J+1; J=0; OP=PLUS;
            GOROW=GOROW-1;
            S=S+1; GO TO PL1; END;


    IF INSTNG(P)=RIGHP THEN
        DO; ARRAY(I,J)=SEC; J=BKCOL(I+1);
            I=BKROW(I+1); GO TO PL1; END;
    IF INSTNG(P)¬=SEC THEN
        DO; OP=INSTNG(P); GO TO PL1; END;
    ARRAY(I,J)=SEC;
END DASEM;
END STAND;
    PUT EDIT('HAN= ',HAN)(A,F(5));
    DO I=0 BY 1 TO 19;
        PUT SKIP EDIT(I,ARRAY(I,0),(ARRAY(I,J) DO J=1 BY 1 WHILE(
            ARRAY(I,J)¬=0 & ARRAY(I,J)¬=SEC)) ,SEC)
            ((20)F(4));
    END;
    DO I=HAN BY -1B TO 0B;
        DO J=0B BY 1B WHILE (ARRAY(I,J)¬=SEC);
            EXP(B19,J)=ARRAY(I,J);  END;
        EXP(B19,J)=SEC;
        EXP(I,LIM)=0B;
        J=0B;
        JP=1B;
NXTOP:  EXP(I,JP)=EXP(B19,J)-EQU;
        IF EXP(B19,J)¬=SEC THEN DO;
            J=J+1B;
            JP=JP+1B;
            IF EXP(B19,J)<0B THEN EXP(I,JP)=EXP(B19,J);
            ELSE DO;
                EXP(I,JP)=WGHT(100B);
                EXP(I,JP+1B)=10B;
                DO JP=JP+10B BY 1B TO EXP(I,JP)+JP;
                    EXP(I,JP)=0B;    END;
                EXP(I,JP)=WGHT(100B);
                END;
            J=J+1B;
            JP=JP+1B;
            GO TO NXTOP;
            END;
        EXP(I,0B)=JP;          END;
    MHT=CSGRPHT(0B);
    EXP(0B,LIM-1B)=0B;
    EXP(0B,LIM-10B)=0B;
    PUT EDIT(' HEIGHT = ',MHT)(A(9),F(3)) SKIP;
    DO I=0 BY 1 TO HAN;
        PUT EDIT(I,(EXP(I,J) DO J=1 BY 1 TO EXP(I,0B)),
            (EXP(I,J) DO J=LIM-2 TO LIM))(F(3),X(2),(160)F(5)) SKIP; END;
END;
END HTEXP;
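
The listing works on weighted operator groups, with the weight vector WGHT(0:4) initialised to (2,2,3,5,1). As a rough illustration of what reducing weighted tree-height by the associative law means, the Python sketch below compares a strictly left-to-right evaluation of a sum with a greedy pairing of the two earliest-ready operands. It is not a transcription of HTEXP; the function names are hypothetical, and the reading of the weights as add, subtract, multiply, divide, and memory access is an assumption based on the listing's comments.

```python
import heapq

# Assumed reading of WGHT(0:4) = (2,2,3,5,1) from the listing.
WGHT = {"add": 2, "sub": 2, "mul": 3, "div": 5, "atom": 1}

def chain_height(ready, w):
    """Weighted height when operands are combined strictly left to right."""
    h = ready[0]
    for r in ready[1:]:
        h = max(h, r) + w
    return h

def reduced_height(ready, w):
    """Weighted height after repeatedly pairing the two earliest-ready operands."""
    heap = list(ready)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        heapq.heappush(heap, max(a, b) + w)
    return heap[0]

atoms = [WGHT["atom"]] * 8                     # eight scalar identifiers
print(chain_height(atoms, WGHT["add"]))        # 15
print(reduced_height(atoms, WGHT["add"]))      # 7
```

For eight atoms the left-to-right chain needs seven dependent additions, while the paired tree needs three levels of additions, which is where the difference between 15 and 7 time units comes from.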
AGGREGATE LENGTH TABLE

STATEMENT NO    IDENTIFIER    LENGTH IN BYTES

      23        ARRAY              6400
     645        ARYA               4000
     645        ARYB               4000
      18        BKCOL                60
      18        BKROW                60
     268        CLS                  18
     647        DIVSW                 4
      23        EXP                6400
       3        INSTNG              400
     647        MISW                  4
     268        NH                    6
     145        OP                  120
       3        OUSTNG              400
     647        PMSW                  4
       2        STRNG               400
      32        TABLES               40
     593        TEMP                320
     145        TLOC                 20
      89        TMP                  20
     145        TMP                  20
      20        WGHT                 10
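
Most byte lengths in the aggregate length table can be reproduced from the array bounds declared in the listing. The Python check below is a sketch under three assumptions that the source does not state: two bytes per BIN FIXED(15) element, BIT(1) arrays packed eight bits to a byte, and eight bytes per LABEL element (inferred from the OP entry, 120 bytes for 15 elements). The helper names are hypothetical.

```python
import math

HALFWORD = 2      # assumed bytes per BIN FIXED(15) element
LABEL_BYTES = 8   # assumed bytes per LABEL element

def fixed_bytes(*extents):
    """Bytes for a BIN FIXED(15) array with the given extents."""
    return HALFWORD * math.prod(extents)

def bit_bytes(n_bits):
    """Bytes for a packed BIT(1) array of n_bits elements."""
    return math.ceil(n_bits / 8)

lengths = {
    "ARRAY": fixed_bytes(20, 160),   # (0:19,0:159)
    "ARYA":  fixed_bytes(20, 100),   # (20,100)
    "BKCOL": fixed_bytes(30),        # (30)
    "NH":    fixed_bytes(3),         # (0:2)
    "TEMP":  fixed_bytes(160),       # (0:159)
    "TLOC":  fixed_bytes(10),        # (0:9)
    "WGHT":  fixed_bytes(5),         # (0:4)
    "PMSW":  bit_bytes(30),          # (30) BIT(1)
    "OP":    LABEL_BYTES * 15,       # (15) LABEL
}

for name, size in lengths.items():
    print(f"{name:6s} {size:5d}")
```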